PATENT



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Application of ) 

) Group Art Unit: T.B.A. 
ALBERTSEN et al. )

) Examiner: T.B.A. 
Serial No. T.B.A. ) 

Filed: even herewith ) Atty.Dkt.No. 01107.78817 

For: APC Antibodies 

AMENDMENT UNDER 37 C.F.R. § 1.121(b) 

Assistant Commissioner of Patents 
Washington, D.C. 20231 

Sir: 

Please enter the following amendments to the reissue application referenced above. We
believe no fee is due in connection with this amendment. If a fee is due, please charge Deposit 
Account No. 19-0733. 
IN THE SPECIFICATION 

At column 3, line 20: 

In even another embodiment a preparation of the human APC protein is provided which is 
substantially free of other human proteins. The amino acid sequence of the protein is shown in [FIG. 
3] FIGS. 3A-3Z (SEQ ID NOS: 7 and 2). 

At column 4, line 26: 

[FIGS. 3A-3F] FIGS. 3A-3Z show the sequence of the APC gene product (SEQ ID NO: 
7). The cDNA sequence was determined through the analysis of 87 cDNA clones derived from 
normal colon, liver, and brain. A total of 8973 bp were contained within overlapping cDNA
clones, defining an ORF of [2842] 2843 amino acids. In frame stop codons surrounded this
ORF, as described in the text, suggesting that the entire APC gene product was represented in the 
ORF illustrated. Only the predicted amino acids are shown. 
At column 6, line 30: 

Alteration of wild-type genes can also be detected on the basis of the alteration of a 
wild-type expression product of the gene. Such expression products include both the APC mRNA 
as well as the APC protein product. The sequences of these products are shown in [FIG. 3] FIGS. 
3A-3Z. Point mutations may be detected by amplifying and sequencing the mRNA or via 
molecular cloning of cDNA made from the mRNA. The sequence of the cloned cDNA can be 
determined using DNA sequencing techniques which are well known in the art. The cDNA can 
also be sequenced via the polymerase chain reaction (PCR) which will be discussed in more detail 
below. 

At column 8, line 32: 

In order to facilitate subsequent cloning of amplified sequences, primers may have 
restriction enzyme site sequences appended to their 5' ends. Thus, all nucleotides of the primers
are derived from APC sequences or sequences adjacent to APC except the few nucleotides 
necessary to form a restriction enzyme site. Such enzymes and sites are well known in the art. 
The primers themselves can be synthesized using techniques which are well known in the art. 
Generally, the primers can be made using oligonucleotide synthesizing machines which are 
commercially available. Given the sequence of the APC open reading frame shown in [FIG. 3] 
FIGS. 3A-3Z (SEQ ID NO: 1), design of particular primers is well within the skill of the art.
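
By way of illustration only, appending a restriction enzyme site to the 5' end of a primer may be sketched in Python as follows. The EcoRI and BamHI recognition sequences are well known in the art; the primer sequence and the function name `add_restriction_site` are hypothetical placeholders and are not taken from the APC sequences or from any primer of this application.

```python
# Recognition sequences of two commonly used restriction enzymes.
RESTRICTION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def add_restriction_site(primer: str, enzyme: str) -> str:
    """Return the primer with the enzyme's recognition site on its 5' end.

    Sequences are written 5' to 3', so the site is prepended to the string.
    """
    site = RESTRICTION_SITES[enzyme]
    return site + primer.upper()


# Hypothetical gene-specific primer (a placeholder, not an APC sequence).
gene_specific = "ATGGCTGCAGCTTCATATG"
tailed = add_restriction_site(gene_specific, "EcoRI")
print(tailed)
```

In this sketch only the few nucleotides of the site differ from the gene-derived portion of the primer, which mirrors the arrangement described above.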

At column 10, line 39: 
Polypeptides which have APC activity can be supplied to cells which carry mutant or
missing APC alleles. The sequence of the APC protein is disclosed in [FIG. 3] FIGS. 3A-3Z 
(SEQ ID NO:7). [These two sequences differ slightly and appear to be indicate the existence of 
two different forms of the APC protein.] Protein can be produced by expression of the cDNA 
sequence in bacteria, for example, using known expression vectors. Alternatively, APC can be 
extracted from APC-producing mammalian cells such as brain cells. In addition, the techniques 
of synthetic chemistry can be employed to synthesize APC protein. Any of such techniques can 
provide the preparation of the present invention which comprises the APC protein. The 
preparation is substantially free of other human proteins. This is most readily accomplished by 
synthesis in a microorganism or in vitro . 

At column 10, line 66: 

A short region of homology has been identified between APC and the human m3 muscarinic 
acetylcholine receptor (mAChR). This homology was largely confined to 29 residues in which 6 out 
of 7 amino acids (EL(GorA)GLQA) were identical (See [FIG. 4] FIG. 4B (SEQ ID NO: 9)). 
Initially, it was not known whether this homology was significant, because many other proteins 
had higher levels of global homology (though few had six out of seven contiguous amino acids 
in common). However, a study on the sequence elements controlling G protein activation by 
mAChR subtypes (Lechleiter et al., EMBO J., p. 4381 (1990)) has shown that a 21 amino acid 
region from the m3 mAChR completely mediated G protein specificity when substituted for the 
21 amino acids of m2 mAChR at the analogous protein position. These 21 residues overlap the 
19 amino acid homology between APC and m3 mAChR. 

At column 13, line 1:

Contig 2: TB1 - TB1 was identified through a cross-hybridization approach. Exons of
genes are often evolutionarily conserved while introns and intergenic regions are much less 
conserved. Thus, if a human probe cross-hybridizes strongly to the DNA from non-primate 
species, there is a reasonable chance that it contains exon sequences. Subclones of the cosmids 
shown in [FIG. 1] FIGS. 1A, 1B-1, 1B-2, and 1B-3 were used to screen Southern blots containing
rodent DNA samples. A subclone of cosmid N5.66 (p 5.66-4) was shown to strongly hybridize 
to rodent DNA, and this clone was used to screen cDNA libraries derived from normal adult colon 
and fetal liver. The ends of the initial cDNA clones obtained in this screen were then used to 
extend the cDNA sequence. Eventually, 11 cDNA clones were isolated, covering 2314 bp. The 
gene detected by these clones was named TB1. Sequence analysis of the overlapping clones
revealed an open reading frame (ORF) that extended for 1302 bp starting from the most 5' 
sequence data obtained (FIG. 2A). If this entire open reading frame were translated, it would 
encode 434 amino acids (SEQ ID NO: 5). The product of this gene was not globally homologous 
to any other sequence in the current database but showed two significant local similarities to a 
family of ADP, ATP carrier/translocator proteins and mitochondrial brown fat uncoupling 
proteins which are widely distributed from yeast to mammals. These conserved regions of TB1
(underlined in FIG. 2A) may define a predictive motif for this sequence family. In addition, TB1
appeared to contain a signal peptide (or mitochondrial targeting sequence) as well as at least 7 
transmembrane domains. 
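
By way of illustration only, the relationship between the length of the open reading frame and the number of encoded amino acids may be checked with a short Python sketch. The function name `orf_codons` is hypothetical, and the sketch assumes that the 1302 bp figure counts coding nucleotides only, at three nucleotides per codon.

```python
def orf_codons(orf_length_bp: int) -> int:
    """Number of codons in an ORF whose length is a whole number of codons."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3


# 1302 bp of open reading frame, as stated in the paragraph above.
print(orf_codons(1302))  # 434
```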

At column 14, line 38: 

Sequence analysis of the APC cDNA clones revealed an open reading frame of 8,535 
nucleotides. The 5' end of the ORF contained a methionine codon (codon 1) that was preceded
by an in-frame stop codon 9 bp upstream, and the 3' end was followed by several in-frame stop
codons. The protein produced by initiation at codon 1 would contain [2,842] 2843 amino acids
(SEQ ID NO: 7) [(FIG. 3)] FIGS. 3A-3Z. The results of database searching with the APC gene
product were quite complex due to the presence of large segments with locally biased amino acid 
compositions. In spite of this, APC could be roughly divided into two domains. The N-terminal 
25% of the protein had a high content of leucine residues (12%) and showed local sequence 
similarities to myosins, various intermediate filament proteins (e.g., desmin, vimentin, 
neurofilaments) and Drosophila armadillo/human plakoglobin. The latter protein is a component 
of adhesive junctions (desmosomes) joining epithelial cells (Franke et al., Proc. Natl. Acad. Sci. 
U.S.A., Vol. 86, p. 4027 (1989); Peifer et al., Cell, Vol. 63, p. 1167 (1990)). The C-terminal
75% of APC (residues 731-2832) is 17% serine by composition with serine residues more or less 
uniformly distributed. This large domain also contains local concentrations of charged (mostly 
acidic) and proline residues. There was no indication of potential signal peptides, transmembrane 
regions, or nuclear targeting signals in APC, suggesting a cytoplasmic localization. 
At column 26, line 27: 

To obtain DNA sequence adjacent to the exons of the genes DP1, DP2.5, and SRP19,
sequencing substrate was obtained by inverse PCR amplification of DNAs from two YACs, 
310D8 and 183H12, that span the deletions. Ligation at low concentration cyclized the restriction 
enzyme-digested YAC DNAs. Oligonucleotides with sequencing tails, designed in inverse 
orientation at intervals along the cDNAs, primed PCR amplification from the cyclized templates. 
Comparison of these DNA sequences with the cDNA sequences placed exon boundaries at the 
divergence points. SRP19 and DP1 were each shown to have five exons. DP2.5 consisted of 15
exons. The sequences of the oligonucleotides synthesized to provide PCR amplification primers
for the exons of each of these genes are listed in Table III [SEQ ID NOS: 39-94] (SEQ ID NOS:
39-94). With the exception of exons 1, 3, 4, 9, and 15 of DP2.5 (see below), the primer
sequences were located in intron sequences flanking the exons. The 5' primer of exon 1 is
complementary to the cDNA sequence, but extends just into the 5' Kozak consensus sequence for
the initiator methionine, allowing a survey of the translated sequences. The 5' primer of exon
3 is actually in the 5' coding sequences of this exon, as three separate intronic primers simply
would not amplify. The 5' primer of exon 4 just overlaps the 5' end of this exon, and we thus 
fail to survey the 19 most 5' bases of this exon. For exon 9, two overlapping primer sets were 
used, such that each had one end within the exon. For exon 15, the large 3' exon of DP2.5, 
overlapping primer pairs were placed along the length of the exon; each pair amplified a product 
of 250-400 bases. 

At column 29, line 1: 

The sequences of the unique conformers from exons 7, 8, 10, and 11 of DP2.5 revealed 
dramatic mutations in the DP2.5 gene. The sequence of the new mutation creating the exon 7 
conformer in patient 3746 was shown to contain a deletion of two adjacent nucleotides, at 
positions 730 and 731 in the cDNA sequence ([FIG. 7,] SEQ ID NO:1). The normal sequence
at this splice junction is CAGAGGTCA (intronic sequence underlined), with the intron-exon
boundary between the two repetitions of AG. The mutant allele in this patient has the sequence 
CAGGTCA. Although this change is at the 5' splice site, comparison with known consensus 
sequences of splice junctions would suggest that a functional splice junction is maintained. If this 
new splice junction were functional, the mutation would introduce a frameshift that creates a stop 
codon 15 nucleotides downstream. If the new splice junction were not functional, messenger
processing would be significantly altered. 
At column 29, line 26: 

The unique conformer found in exon 8 of patient 3460 was found to carry a C-T transition, 
at position 904 in the cDNA sequence of DP2.5 [(shown in FIG. 7)], which replaced the normal 
sequence of CGA with TGA. This point mutation, when read in frame, results in a stop codon
replacing the normal arginine codon. This single-base change had occurred within the context of a 
CG dimer, a potential hot spot for mutation (Barker et al., 1984). 
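
By way of illustration only, the effect of this single-base change may be sketched in Python. The codon table below is deliberately partial and contains only the two codons discussed above (CGA encoding arginine and TGA being a stop codon in the standard genetic code); the function names are hypothetical.

```python
# Partial codon table: only the two codons discussed in the text.
CODON_TABLE = {"CGA": "Arg", "TGA": "Stop"}

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitute(codon: str, index: int, new_base: str) -> str:
    """Return the codon with the base at the 0-based index replaced."""
    return codon[:index] + new_base + codon[index + 1:]


def is_transition(old: str, new: str) -> bool:
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    return old != new and ({old, new} <= PURINES or {old, new} <= PYRIMIDINES)


normal = "CGA"
mutant = substitute(normal, 0, "T")  # C to T at the first codon position
print(normal, CODON_TABLE[normal], "->", mutant, CODON_TABLE[mutant])
print("transition:", is_transition("C", "T"))
```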

At column 30, line 37: 

The continuity of the very large (6.5 kb), most 3' exon in DP2.5 was shown in two ways. 
First, inverse PCR with primers spanning the entire length of this exon revealed no divergence of the 
cDNA sequence from the genomic sequence. Second, PCR amplification with converging primers 
placed at intervals along the exon generated products of the same size whether amplified from the 
originally isolated cDNA, cDNA from various tissues, or genomic template. Two forms of exon 9 
were found in DP2.5: one is the complete exon; and the other, labeled exon 9A, is the result of a
splice into the interior of the exon that deletes bases 934 to 1236 in the mRNA and removes 101
amino acids from the predicted protein (see [FIG. 3] FIGS. 3A-3Z, SEQ ID NOS: 1 & 2).
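
By way of illustration only, the arithmetic behind the exon 9A deletion may be checked with a short Python sketch. The function name `deletion_summary` is hypothetical, and the sketch assumes that the stated base positions are inclusive at both ends.

```python
def deletion_summary(first_base: int, last_base: int):
    """Summarize an inclusive deletion of bases first_base..last_base.

    Returns (length in bases, whether the reading frame is preserved,
    number of codons removed, or None if the deletion is out of frame).
    """
    length = last_base - first_base + 1
    in_frame = length % 3 == 0
    codons_removed = length // 3 if in_frame else None
    return length, in_frame, codons_removed


# Bases 934 to 1236 of the mRNA, as stated in the paragraph above.
print(deletion_summary(934, 1236))  # (303, True, 101)
```

Under that assumption the deletion spans 303 bases, a multiple of three, so the reading frame is preserved and 101 codons are removed, in agreement with the 101 amino acids stated above.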

At column 31, line 30: 

The cDNA consensus sequence of APC predicts that the longer, more abundant form of 
the message codes for a [2842 or 2844] 2843 amino acid peptide with a mass of 311.8 kd. This 
predicted APC peptide was compared with the current data bases of protein and DNA sequences 
using both Intelligenetics and GCG software packages. No genes with a high degree of amino 
acid sequence similarity were found. Although many short (approximately 20 amino acid) regions
of sequence similarity were uncovered, none was sufficiently strong to reveal which, if any, might 
represent functional homology. Interestingly, multiple similarities to myosins and keratins did 
appear. The APC gene also was scanned for sequence motifs of known function; although 
multiple glycosylation, phosphorylation, and myristoylation sites were seen, their significance is 
uncertain. 

At columns 31-132: 

Please delete the sequence listing and replace it with the enclosed substitute sequence 
listing. The substitute sequence listing is identical to the sequence listing in the patent with the 
exception of one amino acid in SEQ ID NO:7. The substitute sequence listing contains a proline 
at position 173. 

Remarks 

The specification has been amended to correct the number of amino acids said to be present 
in the APC protein. This correction is supported in Figure 3 and in SEQ ID NOS: 1 and 2, each of 
which shows a 2843 amino acid APC protein.

The sequence listing has been amended to correct the amino acid sequence of the APC protein 
shown in SEQ ID NO:7, by insertion of a proline at position 173 of SEQ ID NO:7. This amendment 
is supported in the issued patent in Figure 3 and in SEQ ID NOS: 1 and 2, each of which contains a
proline at position 173. A computer readable form of the substitute sequence listing is provided for 
use in examining this application. The contents of the computer readable form and the paper copy 
of the substitute sequence listing are identical. The contents of the substitute sequence listing are 
identical to those of the original sequence listing except for the insertion of the proline at position 173 
in SEQ ID NO:7.

The specification also has been amended to refer separately to each figure according to 37 
C.F.R. § 1.74 and to delete references to originally filed Figure 7, which was cancelled during 
prosecution. 

None of the amendments to the specification or sequence listing adds new matter. 

Respectfully submitted, 



Date: November 18, 1999 By:

Lisa M. Hemmendinger 
Registration No. 42,653 

Banner & Witcoff, Ltd.
1001 G Street, N.W., Eleventh Floor 
Washington, D.C. 20001-4597 
(202) 508-9100 
APC ANTIBODIES



This application is a division of application Ser. No.
08/289,548, filed Aug. 12, 1994, which is a division of
application Ser. No. 07/741,940, filed Aug. 8, 1991 (issued as
U.S. Pat. No. 5,352,775).

The U.S. Government has a paid-up license in this
invention and the right in limited circumstances to require
the patent owner to license others on reasonable terms as
provided for by the terms of grants awarded by the National
Institutes of Health.

TECHNICAL AREA OF THE INVENTION 

The invention relates to the area of cancer diagnostics and
therapeutics. More particularly, the invention relates to
detection of the germline and somatic alterations of wild-type
APC genes. In addition, it relates to therapeutic intervention
to restore the function of APC gene product.

BACKGROUND OF THE INVENTION

According to the model of Knudson for tumorigenesis
(Cancer Research, Vol. 45, p. 1482, 1985), there are tumor
suppressor genes in all normal cells which, when they
become non-functional due to mutation, cause neoplastic
development. Evidence for this model has been found in the
cases of retinoblastoma and colorectal tumors. The implicated
suppressor genes in those tumors, RB, p53, DCC and
MCC, were found to be deleted or altered in many cases of
the tumors studied (Hansen and Cavenee, Cancer Research,
Vol. 47, pp. 5518-5527 (1987); Baker et al., Science, Vol.
244, p. 217 (1989); Fearon et al., Science, Vol. 247, p. 49
(1990); Kinzler et al., Science, Vol. 251, p. 1366 (1991)).

In order to fully understand the pathogenesis of tumors, it
will be necessary to identify the other suppressor genes that
play a role in the tumorigenesis process. Prominent among
these is the one(s) presumptively located at 5q21. Cytogenetic
(Herrera et al., Am. J. Med. Genet., Vol. 25, p. 473
(1986)) and linkage (Leppert et al., Science, Vol. 238, p. 1411
(1987); Bodmer et al., Nature, Vol. 328, p. 614 (1987))
studies have shown that this chromosome region harbors the
gene responsible for familial adenomatous polyposis (FAP)
and Gardner's Syndrome (GS). FAP is an autosomal-dominant,
inherited disease in which affected individuals
develop hundreds to thousands of adenomatous polyps,
some of which progress to malignancy. GS is a variant of
FAP in which desmoid tumors, osteomas and other soft
tissue tumors occur together with multiple adenomas of the
colon and rectum. A less severe form of polyposis has been
identified in which only a few (2-40) polyps develop. This
condition also is familial and is linked to the same chromosomal
markers as FAP and GS (Leppert et al., New England
Journal of Medicine, Vol. 322, pp. 904-908, 1990).
Additionally, this chromosomal region is often deleted from
the adenomas (Vogelstein et al., N. Engl. J. Med., Vol. 319,
p. 525 (1988)) and carcinomas (Vogelstein et al., N. Engl. J.
Med., Vol. 319, p. 525 (1988); Solomon et al., Nature, Vol.
328, p. 616 (1987); Sasaki et al., Cancer Research, Vol. 49,
p. 4402 (1989); Delattre et al., Lancet, Vol. 2, p. 353 (1989);
and Ashton-Rickardt et al., Oncogene, Vol. 4, p. 1169
(1989)) of patients without FAP (sporadic tumors). Thus, a
putative suppressor gene on chromosome 5q21 appears to
play a role in the early stages of colorectal neoplasia in both
sporadic and familial tumors.

Although the MCC gene has been identified on 5q21 as a
candidate suppressor gene, it does not appear to be altered
in FAP or GS patients. Thus there is a need in the art for
investigations of this chromosomal region to identify genes
and to determine if any of such genes are associated with
FAP and/or GS and the process of tumorigenesis.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a method 
for diagnosing and prognosing a neoplastic tissue of a 
human. 

It is another object of the invention to provide a method
of detecting genetic predisposition to cancer.

It is another object of the invention to provide a method
of supplying wild-type APC gene function to a cell which
has lost said gene function.

It is yet another object of the invention to provide a kit for
determination of the nucleotide sequence of APC alleles by
the polymerase chain reaction.

It is still another object of the invention to provide nucleic
acid probes for detection of mutations in the human APC
gene.

It is still another object of the invention to provide a
cDNA molecule encoding the APC gene product.

It is yet another object of the invention to provide a
preparation of the human APC protein.

It is another object of the invention to provide a method
of screening for genetic predisposition to cancer.

It is an object of the invention to provide methods of
testing therapeutic agents for the ability to suppress neoplasia.

It is still another object of the invention to provide animals 
carrying mutant APC alleles. 

These and other objects of the invention are provided by
one or more of the embodiments which are described below.
In one embodiment of the present invention a method of
diagnosing or prognosing a neoplastic tissue of a human is
provided comprising: detecting somatic alteration of wild-type
APC genes or their expression products in a sporadic
colorectal cancer tissue, said alteration indicating neoplasia
of the tissue.

In yet another embodiment a method is provided of
detecting genetic predisposition to cancer in a human, including
familial adenomatous polyposis (FAP) and Gardner's
Syndrome (GS), comprising: isolating a human sample
selected from the group consisting of blood and fetal tissue;
detecting alteration of wild-type APC gene coding
sequences or their expression products from the sample, said
alteration indicating genetic predisposition to cancer.
In another embodiment of the present invention a method
is provided for supplying wild-type APC gene function to a
cell which has lost said gene function by virtue of a mutation
in the APC gene, comprising: introducing a wild-type APC
gene into a cell which has lost said gene function such that
said wild-type gene is expressed in the cell.

In another embodiment a method of supplying wild-type
APC gene function to a cell is provided comprising: introducing
a portion of a wild-type APC gene into a cell which
has lost said gene function such that said portion is
expressed in the cell, said portion encoding a part of the APC
protein which is required for non-neoplastic growth of said
cell. APC protein can also be applied to cells or administered
to animals to remediate for mutant APC genes. Synthetic
peptides or drugs can also be used to mimic APC function
in cells which have altered APC expression.

In yet another embodiment a pair of single stranded
primers is provided for determination of the nucleotide
sequence of the APC gene by polymerase chain reaction.
The sequence of said pair of single stranded DNA primers is
derived from chromosome 5q band 21, said pair of primers
allowing synthesis of APC gene coding sequences.

In still another embodiment of the invention a nucleic acid
probe is provided which is complementary to human wild-type
APC gene coding sequences and which can form
mismatches with mutant APC genes, thereby allowing their
detection by enzymatic or chemical cleavage or by shifts in
electrophoretic mobility.

In another embodiment of the invention a method is
provided for detecting the presence of a neoplastic tissue in
a human. The method comprises isolating a body sample
from a human; detecting in said sample alteration of a
wild-type APC gene sequence or wild-type APC expression
product, said alteration indicating the presence of a neoplastic
tissue in the human.

In still another embodiment a cDNA molecule is provided 
which comprises the coding sequence of the APC gene. 

In even another embodiment a preparation of the human
APC protein is provided which is substantially free of other
human proteins. The amino acid sequence of the protein is
shown in FIG. 3 (SEQ ID NOS: 7 and 2).

In yet another embodiment of the invention a method is
provided for screening for genetic predisposition to cancer,
including familial adenomatous polyposis (FAP) and Gardner's
Syndrome (GS), in a human. The method comprises:
detecting among kindred persons the presence of a DNA
polymorphism which is linked to a mutant APC allele in an
individual having a genetic predisposition to cancer, said
kindred being genetically related to the individual, the
presence of said polymorphism suggesting a predisposition
to cancer.

In another embodiment of the invention a method of
testing therapeutic agents for the ability to suppress a
neoplastically transformed phenotype is provided. The
method comprises: applying a test substance to a cultured
epithelial cell which carries a mutation in an APC allele; and
determining whether said test substance suppresses the
neoplastically transformed phenotype of the cell.

In another embodiment of the invention a method of
testing therapeutic agents for the ability to suppress a
neoplastically transformed phenotype is provided. The
method comprises: administering a test substance to an
animal which carries a mutant APC allele; and determining
whether said test substance prevents or suppresses the
growth of tumors.

In still other embodiments of the invention transgenic
animals are provided. The animals carry a mutant APC allele
from a second animal species or have been genetically
engineered to contain an insertion mutation which disrupts
an APC allele.

The present invention provides the art with the information
that the APC gene, a heretofore unknown gene, is in
fact a target of mutational alterations on chromosome 5q21
and that these alterations are associated with the process of
tumorigenesis. This information allows highly specific
assays to be performed to assess the neoplastic status of a
particular tissue or the predisposition to cancer of an individual.
This invention has applicability to Familial
Adenomatous Polyposis, sporadic colorectal cancers, Gardner's
Syndrome, as well as the less severe familial polyposis
discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1A shows an overview of yeast artificial chromosome
(YAC) contigs. Genetic distances between selected
RFLP markers from within the contigs are shown in
centiMorgans.
FIGS. 1B-1, 1B-2 and 1B-3 show a detailed map of the
three central contigs. The position of the six identified genes
from within the FAP region is shown; the 5' and 3' ends of
the transcripts from these genes have in general not yet been
isolated, as indicated by the string of dots surrounding the
bars denoting the genes' positions. Selected restriction endonuclease
recognition sites are indicated. B, BssH2; S, SstII;
M, MluI; N, NruI.
FIGS. 2A and 2B show the sequence of TB1 (FIG. 2A)
and TB2 (FIG. 2B) genes. The cDNA sequence of the TB1
gene was determined from the analysis of 11 cDNA clones
derived from normal colon and liver, as described in the text.
A total of 2314 bp were contained within the overlapping
cDNA clones, defining an ORF of 424 amino acids beginning
at nucleotide 1. Only the predicted amino acids from
the ORF are shown. The carboxy-terminal end of the ORF
has apparently been identified, but the 5' end of the TB1
transcript has not yet been precisely determined.
The cDNA sequence of the TB2 gene was determined
from the YS-39 clone derived as described in the text. This
clone consisted of 2300 bp and defined an ORF of 185
amino acids beginning at nucleotide 1. Only the predicted
amino acids are shown. The carboxy terminal end of the
ORF has apparently been identified, but the 5' end of the
TB2 transcript has not been precisely determined.

FIGS. 3A-3F show the sequence of the APC gene product
(SEQ ID NO:7). The cDNA sequence was determined
through the analysis of 87 cDNA clones derived from
normal colon, liver, and brain. A total of 8973 bp were
contained within overlapping cDNA clones, defining an
ORF of 2842 amino acids. In frame stop codons surrounded
this ORF, as described in the text, suggesting that the entire
APC gene product was represented in the ORF illustrated.
Only the predicted amino acids are shown.

FIGS. 4A and 4B show the local similarity between
human APC (SEQ ID NO:2) and ral2 (SEQ ID NO:8) of
yeast. FIG. 4A shows amino acids 203 to 233 of APC, and
FIG. 4B shows amino acids 453 to 481 of APC. Local
similarity among the APC (SEQ ID NO:5) and MCC
(SEQ ID NO:10) genes and the m3 muscarinic acetylcholine
receptor (SEQ ID NO:9) is shown. The region of the
mAChR shown corresponds to that responsible for coupling
the receptor to G proteins. The connecting lines indicate
identities; dots indicate related amino acid residues.

FIG. 5 shows the genomic map of the 1200 kb NotI
fragment at the FAP locus. The NotI fragment is shown as
a bold line. Relevant parts of the deletion chromosomes
from patients 3214 and 3824 are shown as stippled lines.
Probes used to characterize the NotI fragment and the
deletions, and three YACs from which subclones were
obtained, are shown below the restriction map. The chimeric
end of YAC 183H12 is indicated by a dotted line. The
orientation and approximate position of MCC are indicated
above the map.

FIGS. 6A-6D show the DNA sequence (SEQ ID NO:3) and
predicted amino acid sequence of DP1 (TB2) (SEQ ID
NO:4). The nucleotide numbering begins at the most 5'
nucleotide isolated. A proposed initiation methionine (base
77) is indicated in bold type. The entire coding sequence is
presented.

FIG. 7A, FIG. 7B-1, and FIG. 7B-2 show the arrangement
of exons in DP2.5 (APC). (A) Exon 9 corresponds to
nucleotides 933-1312; exon 9a corresponds to nucleotides
1236-1312. The stop codon in the cDNA is at nucleotide
8535. (B) Partial intronic sequence surrounding each exon is
shown (SEQ ID NOS: 11-38). 5' intron sequences of exons 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15 are shown in SEQ
ID NOS: 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36,
38, respectively. 3' intron sequences of exons 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, and 14 are shown in SEQ ID NOS:
11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37,
respectively.

DETAILED DESCRIPTION 

It is a discovery of the present invention that mutational
events associated with tumorigenesis occur in a previously
unknown gene on chromosome 5q named here the APC
(Adenomatous Polyposis Coli) gene. Although it was previously
known that deletion of alleles on chromosome 5q
was common in certain types of cancers, it was not known
that a target gene of these deletions was the APC gene.
Further it was not known that other types of mutational
events in the APC gene are also associated with cancers. The
mutations of the APC gene can involve gross
rearrangements, such as insertions and deletions. Point
mutations have also been observed.

According to the diagnostic and prognostic method of the
present invention, alteration of the wild-type APC gene is
detected. "Alteration of a wild-type gene" according to the
present invention encompasses all forms of mutations,
including deletions. The alteration may be due to either
rearrangements such as insertions, inversions, and deletions,
or to point mutations. Deletions may be of the entire gene or
only a portion of the gene. Somatic mutations are those
which occur only in certain tissues, e.g., in the tumor tissue,
and are not inherited in the germline. Germline mutations
can be found in any of a body's tissues. If only a single allele
is somatically mutated, an early neoplastic state is indicated.
However, if both alleles are mutated then a late neoplastic
state is indicated. The finding of APC mutations thus provides
both diagnostic and prognostic information. An APC
allele which is not deleted (e.g., that on the sister chromosome
to a chromosome carrying an APC deletion) can be
screened for other mutations, such as insertions, small
deletions, and point mutations. It is believed that many
mutations found in tumor tissues will be those leading to
decreased expression of the APC gene product. However,
mutations leading to non-functional gene products would
also lead to a cancerous state. Point mutational events may
occur in regulatory regions, such as in the promoter of the
gene, leading to loss or diminution of expression of the
mRNA. Point mutations may also abolish proper RNA
processing, leading to loss of expression of the APC gene
product.

In order to detect the alteration of the wild-type APC gene
in a tissue, it is helpful to isolate the tissue free from
surrounding normal tissues. Means for enriching a tissue
preparation for tumor cells are known in the art. For
example, the tissue may be isolated from paraffin or cryostat
sections. Cancer cells may also be separated from normal
cells by flow cytometry. These as well as other techniques
for separating tumor from normal cells are well known in the
art. If the tumor tissue is highly contaminated with normal
cells, detection of mutations is more difficult.

Detection of point mutations may be accomplished by
molecular cloning of the APC allele (or alleles) and sequencing
that allele(s) using techniques well known in the art.
Alternatively, the polymerase chain reaction (PCR) can be
used to amplify gene sequences directly from a genomic
DNA preparation from the tumor tissue. The DNA sequence
of the amplified sequences can then be determined. The
polymerase chain reaction itself is well known in the art.
See, e.g., Saiki et al., Science, Vol. 239, p. 487, 1988; U.S.
Pat. No. 4,683,203; and U.S. Pat. No. 4,683,195. Specific
primers which can be used in order to amplify the gene will
be discussed in more detail below. The ligase chain reaction,
which is known in the art, can also be used to amplify APC
sequences. See Wu et al., Genomics, Vol. 4, pp. 560-569
(1989). In addition, a technique known as allele specific
PCR can be used. (See Ruano and Kidd, Nucleic Acids
Research, Vol. 17, p. 8392, 1989.) According to this
technique, primers are used which hybridize at their 3' ends
to a particular APC mutation. If the particular APC mutation
is not present, an amplification product is not observed.
Amplification Refractory Mutation System (ARMS) can
also be used as disclosed in European Patent Application
Publication No. 0332435 and in Newton et al., Nucleic
Acids Research, Vol. 17, p. 7, 1989. Insertions and deletions
of genes can also be detected by cloning, sequencing and
amplification. In addition, restriction fragment length
polymorphism (RFLP) probes for the gene or surrounding
marker genes can be used to score alteration of an allele or
an insertion in a polymorphic fragment. Such a method is
particularly useful for screening among kindred persons of
an affected individual for the presence of the APC mutation
found in that individual. Single stranded conformation
polymorphism (SSCP) analysis can also be used to detect base
change variants of an allele. (Orita et al., Proc. Natl. Acad.
Sci. USA, Vol. 86, pp. 2766-2770, 1989, and Genomics, Vol.
5, pp. 874-879, 1989.) Other techniques for detecting
insertions and deletions as are known in the art can be used.
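
The logic of allele specific PCR described above can be illustrated
with a minimal sketch. It assumes a simplified model in which a
product is predicted only when the 3' terminal base of the primer
matches the template. The sequences, the function name `amplifies`
and the mismatch tolerance are hypothetical and are not taken from
the APC sequence.

```python
def amplifies(primer: str, template: str, start: int = 0,
              max_internal_mismatches: int = 0) -> bool:
    """Predict amplification in a simplified allele-specific PCR model.

    The primer is written in the same sense as the template strand shown.
    A product is predicted only if the primer's 3' terminal base matches
    the template (assumption: a 3' mismatch blocks extension entirely).
    """
    target = template[start:start + len(primer)]
    if len(target) != len(primer):
        return False  # primer runs off the end of the template
    internal = sum(a != b for a, b in zip(primer[:-1], target[:-1]))
    return primer[-1] == target[-1] and internal <= max_internal_mismatches


# Hypothetical 20-base region; the mutant differs at index 18 (G -> T).
wild_type = "ATGGCTGCAGCTTCATATGA"
mutant = "ATGGCTGCAGCTTCATATTA"

# Primer whose 3' end sits on the mutant base.
primer = mutant[:19]

print(amplifies(primer, mutant))     # True: mutant allele present
print(amplifies(primer, wild_type))  # False: 3' end mismatched
```

In this model the absence of a product from the wild-type template
corresponds to the statement that no amplification product is
observed when the particular mutation is not present.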

Alteration of wild-type genes can also be detected on the
basis of the alteration of a wild-type expression product of
the gene. Such expression products include both the APC
mRNA as well as the APC protein product. The sequences
of these products are shown in FIG. 3. Point mutations may
be detected by amplifying and sequencing the mRNA or via
molecular cloning of cDNA made from the mRNA. The
sequence of the cloned cDNA can be determined using DNA
sequencing techniques which are well known in the art. The
cDNA can also be sequenced via the polymerase chain
reaction (PCR) which will be discussed in more detail
below.

Mismatches, according to the present invention, are
hybridized nucleic acid duplexes which are not 100%
homologous. The lack of total homology may be due to
deletions, insertions, inversions, substitutions or frameshift
mutations. Mismatch detection can be used to detect point
mutations in the gene or its mRNA product. While these
techniques are less sensitive than sequencing, they are
simpler to perform on a large number of tumor samples. An
example of a mismatch cleavage technique is the RNase
protection method, which is described in detail in Winter et
al., Proc. Natl. Acad. Sci. USA, Vol. 82, p. 7575, 1985 and
Meyers et al., Science, Vol. 230, p. 1242, 1985. In the
practice of the present invention the method involves the use
of a labeled riboprobe which is complementary to the human
wild-type APC gene coding sequence. The riboprobe and
either mRNA or DNA isolated from the tumor tissue are
annealed (hybridized) together and subsequently digested
with the enzyme RNase A which is able to detect some
mismatches in a duplex RNA structure. If a mismatch is
detected by RNase A, it cleaves at the site of the mismatch.
Thus, when the annealed RNA preparation is separated on an
electrophoretic gel matrix, if a mismatch has been detected
and cleaved by RNase A, an RNA product will be seen
which is smaller than the full-length duplex RNA for the
riboprobe and the mRNA or DNA. The riboprobe need not
be the full length of the APC mRNA or gene but can be a
segment of either. If the riboprobe comprises only a segment
of the APC mRNA or gene it will be desirable to use a
number of these probes to screen the whole mRNA sequence
for mismatches.
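
The use of a number of segment probes to screen a whole sequence
can be sketched as a simple tiling calculation. The following is an
illustration only: the function name `tile_probes`, the probe length
and the overlap are assumptions chosen for the example and are not
values given in this specification.

```python
def tile_probes(target_len: int, probe_len: int, overlap: int):
    """Return (start, end) spans of probes that together cover a target.

    Spans are 0-based and half-open. Adjacent probes share `overlap`
    bases; the last probe is shortened so that it ends at the target end.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if probe_len <= overlap or overlap < 0:
        raise ValueError("probe_len must exceed a non-negative overlap")
    step = probe_len - overlap
    spans = []
    start = 0
    while True:
        end = min(start + probe_len, target_len)
        spans.append((start, end))
        if end == target_len:
            break
        start += step
    return spans


# Hypothetical example: a 1000-base target, 400-base probes, 100-base overlap.
spans = tile_probes(1000, 400, 100)
print(spans)  # [(0, 400), (300, 700), (600, 1000)]
```

Setting the overlap to zero gives contiguous rather than overlapping
probes.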

In similar fashion, DNA probes can be used to detect
mismatches, through enzymatic or chemical cleavage. See,
e.g., Cotton et al., Proc. Natl. Acad. Sci. USA, Vol. 85, 4397,
1988; and Shenk et al., Proc. Natl. Acad. Sci. USA, Vol. 72,
p. 989, 1975. Alternatively, mismatches can be detected by
shifts in the electrophoretic mobility of mismatched
duplexes relative to matched duplexes. See, e.g., Cariello,
Human Genetics, Vol. 42, p. 726, 1988. With either riboprobes
or DNA probes, the cellular mRNA or DNA which
might contain a mutation can be amplified using PCR (see
below) before hybridization. Changes in DNA of the APC
gene can also be detected using Southern hybridization,
especially if the changes are gross rearrangements, such as
deletions and insertions.

DNA sequences of the APC gene which have been
amplified by use of polymerase chain reaction may also be
screened using allele-specific probes. These probes are
nucleic acid oligomers, each of which contains a region of
the APC gene sequence harboring a known mutation. For
example, one oligomer may be about 30 nucleotides in
length, corresponding to a portion of the APC gene
sequence. By use of a battery of such allele-specific probes,
PCR amplification products can be screened to identify the
presence of a previously identified mutation in the APC
gene. Hybridization of allele-specific probes with amplified
APC sequences can be performed, for example, on a nylon
filter. Hybridization to a particular probe under stringent
hybridization conditions indicates the presence of the same
mutation in the tumor tissue as in the allele-specific probe.

Alteration of APC mRNA expression can be detected by
any technique known in the art. These include Northern blot
analysis, PCR amplification and RNase protection. Diminished
mRNA expression indicates an alteration of the wild-type
APC gene. Alteration of wild-type APC genes can also
be detected by screening for alteration of wild-type APC
protein. For example, monoclonal antibodies immunoreactive
with APC can be used to screen a tissue. Lack of cognate
antigen would indicate an APC mutation. Antibodies specific
for products of mutant alleles could also be used to
detect mutant APC gene product. Such immunological
assays can be done in any convenient format known in the
art. These include Western blots, immunohistochemical
assays and ELISA assays. Any means for detecting an
altered APC protein can be used to detect alteration of
wild-type APC genes. Functional assays can be used, such as
protein binding determinations. For example, it is believed
that APC protein oligomerizes to itself and/or MCC protein
or binds to a G protein. Thus, an assay for the ability to bind
to wild type APC or MCC protein or that G protein can be
employed. In addition, assays can be used which detect APC
biochemical function. It is believed that APC is involved in
phospholipid metabolism. Thus, assaying the enzymatic
products of the involved phospholipid metabolic pathway
can be used to determine APC activity. Finding a mutant
APC gene product indicates alteration of a wild-type APC
gene.

Mutant APC genes or gene products can also be detected
in other human body samples, such as serum, stool, urine
and sputum. The same techniques discussed above for
detection of mutant APC genes or gene products in tissues
can be applied to other body samples. Cancer cells are
sloughed off from tumors and appear in such body samples.
In addition, the APC gene product itself may be secreted into
the extracellular space and found in these body samples even
in the absence of cancer cells. By screening such body
samples, a simple early diagnosis can be achieved for many
types of cancers. In addition, the progress of chemotherapy
or radiotherapy can be monitored more easily by testing
such body samples for mutant APC genes or gene products.

The methods of diagnosis of the present invention are
applicable to any tumor in which APC has a role in
tumorigenesis. Deletions of chromosome arm 5q have been
observed in tumors of lung, breast, colon, rectum, bladder,
liver, sarcomas, stomach and prostate, as well as in leukemias
and lymphomas. Thus these are likely to be tumors in
which APC has a role. The diagnostic method of the present
invention is useful for clinicians so that they can decide upon
an appropriate course of treatment. For example, a tumor
displaying alteration of both APC alleles might suggest a
more aggressive therapeutic regimen than a tumor displaying
alteration of only one APC allele.

The primer pairs of the present invention are useful for
determination of the nucleotide sequence of a particular
APC allele using the polymerase chain reaction. The pairs of
single stranded DNA primers can be annealed to sequences
within or surrounding the APC gene on chromosome 5q in
order to prime amplifying DNA synthesis of the APC gene
itself. A complete set of these primers allows synthesis of all
of the nucleotides of the APC gene coding sequences, i.e.,
the exons. The set of primers preferably allows synthesis of
both intron and exon sequences. Allele specific primers can
also be used. Such primers anneal only to particular APC
mutant alleles, and thus will only amplify a product in the
presence of the mutant allele as a template.

In order to facilitate subsequent cloning of amplified
sequences, primers may have restriction enzyme site
sequences appended to their 5' ends. Thus, all nucleotides of
the primers are derived from APC sequences or sequences
adjacent to APC except the few nucleotides necessary to
form a restriction enzyme site. Such enzymes and sites are
well known in the art. The primers themselves can be
synthesized using techniques which are well known in the
art. Generally, the primers can be made using oligonucleotide
synthesizing machines which are commercially available.
Given the sequence of the APC open reading frame
shown in FIG. 3 (SEQ ID NO:1), design of particular
primers is well within the skill of the art.
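
Appending a restriction enzyme site to the 5' end of a primer can be
sketched as follows. This is an illustration under stated
assumptions: the EcoRI recognition sequence GAATTC is used as an
example site, and the gene-derived primer sequence and the function
name `add_5prime_site` are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence, used here as an example


def add_5prime_site(gene_primer: str, site: str = ECORI_SITE) -> str:
    """Return a primer with a restriction site appended to its 5' end.

    Sequences are written 5' to 3', so the site is placed at the start
    of the string and the gene-derived bases remain at the 3' end,
    where they prime DNA synthesis.
    """
    gene_primer = gene_primer.upper()
    site = site.upper()
    for base in gene_primer + site:
        if base not in "ACGT":
            raise ValueError(f"unexpected base: {base!r}")
    return site + gene_primer


# Hypothetical 20-base gene-derived primer (not an actual APC sequence).
cloning_primer = add_5prime_site("ATGGCTGCAGCTTCATATGA")
print(cloning_primer)  # GAATTCATGGCTGCAGCTTCATATGA
```

Only the six added bases differ from the gene-derived sequence,
which matches the statement that all other nucleotides of the primer
are derived from APC or adjacent sequences.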

The nucleic acid probes provided by the present invention
are useful for a number of purposes. They can be used in
Southern hybridization to genomic DNA and in the RNase
protection method for detecting point mutations already
discussed above. The probes can be used to detect PCR
amplification products. They may also be used to detect
mismatches with the APC gene or mRNA using other
techniques. Mismatches can be detected using either
enzymes (e.g., S1 nuclease), chemicals (e.g., hydroxylamine
or osmium tetroxide and piperidine), or changes in
electrophoretic mobility of mismatched hybrids as compared to
totally matched hybrids. These techniques are known in the
art. See, Cotton, supra, Shenk, supra, Myers, supra, Winter,
supra, and Novack et al., Proc. Natl. Acad. Sci. USA, Vol.
83, p. 586, 1986. Generally, the probes are complementary
to APC gene coding sequences, although probes to certain
introns are also contemplated. An entire battery of nucleic
acid probes is used to compose a kit for detecting alteration
of wild-type APC genes. The kit allows for hybridization to
the entire APC gene. The probes may overlap with each
other or be contiguous.

If a riboprobe is used to detect mismatches with mRNA,
it is complementary to the mRNA of the human wild-type
APC gene. The riboprobe thus is an anti-sense probe in that
it does not code for the APC protein because it is of the
opposite polarity to the sense strand. The riboprobe generally
will be labeled with a radioactive, colorimetric, or
fluorometric material, which can be accomplished by any
means known in the art. If the riboprobe is used to detect
mismatches with DNA it can be of either polarity, sense or
anti-sense. Similarly, DNA probes also may be used to detect
mismatches.
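
The anti-sense relationship between a riboprobe and its target mRNA
amounts to taking the reverse complement of the RNA sequence. A
minimal sketch follows; the function name `antisense_riboprobe` and
the short example sequence are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_riboprobe(mrna_segment: str) -> str:
    """Return the anti-sense RNA (reverse complement) of an mRNA segment.

    Both input and output are written 5' to 3'. The result pairs with
    the mRNA and, being of opposite polarity, does not encode the protein.
    """
    segment = mrna_segment.upper()
    try:
        return "".join(RNA_COMPLEMENT[base] for base in reversed(segment))
    except KeyError as err:
        raise ValueError(f"not an RNA base: {err.args[0]!r}") from None


# Hypothetical six-base segment beginning with a start codon.
print(antisense_riboprobe("AUGGCU"))  # AGCCAU
```

Applying the function twice returns the original segment, which is a
quick check that the probe and target are mutually complementary.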

Nucleic acid probes may also be complementary to
mutant alleles of the APC gene. These are useful to detect
similar mutations in other patients on the basis of
hybridization rather than mismatches. These are discussed above
and referred to as allele-specific probes. As mentioned
above, the APC probes can also be used in Southern
hybridizations to genomic DNA to detect gross chromosomal
changes such as deletions and insertions. The probes
can also be used to select cDNA clones of APC genes from
tumor and normal tissues. In addition, the probes can be used
to detect APC mRNA in tissues to determine if expression is
diminished as a result of alteration of wild-type APC genes.

According to the present invention a method is also
provided of supplying wild-type APC function to a cell
which carries mutant APC alleles. Supplying such function
should suppress neoplastic growth of the recipient cells. The
wild-type APC gene or a part of the gene may be introduced
into the cell in a vector such that the gene remains
extrachromosomal. In such a situation the gene will be expressed
by the cell from the extrachromosomal location. If a gene
portion is introduced and expressed in a cell carrying a
mutant APC allele, the gene portion should encode a part of
the APC protein which is required for non-neoplastic growth
of the cell. More preferred is the situation where the wild-type
APC gene or a part of it is introduced into the mutant
cell in such a way that it recombines with the endogenous
mutant APC gene present in the cell. Such recombination
requires a double recombination event which results in the
correction of the APC gene mutation. Vectors for introduction
of genes both for recombination and for extrachromosomal
maintenance are known in the art and any suitable
vector may be used. Methods for introducing DNA into cells
such as electroporation, calcium phosphate co-precipitation
and viral transduction are known in the art and the choice of
method is within the competence of the routineer. Cells
transformed with the wild-type APC gene can be used as
model systems to study cancer remission and drug treatments
which promote such remission.

Similarly, cells and animals which carry a mutant APC
allele can be used as model systems to study and test for
substances which have potential as therapeutic agents. The
cells are typically cultured epithelial cells. These may be
isolated from individuals with APC mutations, either
somatic or germline. Alternatively, the cell line can be
engineered to carry the mutation in the APC allele. After a
test substance is applied to the cells, the neoplastically
transformed phenotype of the cell will be determined. Any
trait of neoplastically transformed cells can be assessed,
including anchorage-independent growth, tumorigenicity in
nude mice, invasiveness of cells, and growth factor
dependence. Assays for each of these traits are known in the art.

Animals for testing therapeutic agents can be selected
after mutagenesis of whole animals or after treatment of
germline cells or zygotes. Such treatments include insertion
of mutant APC alleles, usually from a second animal
species, as well as insertion of disrupted homologous genes.
Alternatively, the endogenous APC gene(s) of the animals
may be disrupted by insertion or deletion mutation. After test
substances have been administered to the animals, the
growth of tumors must be assessed. If the test substance
prevents or suppresses the growth of tumors, then the test
substance is a candidate therapeutic agent for the treatment
of FAP and/or sporadic cancers.

Polypeptides which have APC activity can be supplied to
cells which carry mutant or missing APC alleles. The
sequence of the APC protein is disclosed in FIG. 3 (SEQ ID
NO:7). These two sequences differ slightly and appear to
indicate the existence of two different forms of the APC
protein. Protein can be produced by expression of the cDNA
sequence in bacteria, for example, using known expression
vectors. Alternatively, APC can be extracted from
APC-producing mammalian cells such as brain cells. In addition,
the techniques of synthetic chemistry can be employed to
synthesize APC protein. Any of such techniques can provide
the preparation of the present invention which comprises the
APC protein. The preparation is substantially free of other
human proteins. This is most readily accomplished by
synthesis in a microorganism or in vitro.

Active APC molecules can be introduced into cells by
microinjection or by use of liposomes, for example.
Alternatively, some such active molecules may be taken up
by cells, actively or by diffusion. Extracellular application of
APC gene product may be sufficient to affect tumor growth.
Supply of molecules with APC activity should lead to a
partial reversal of the neoplastic state. Other molecules with
APC activity may also be used to effect such a reversal, for
example peptides, drugs, or organic compounds.

The present invention also provides a preparation of
antibodies immunoreactive with a human APC protein. The
antibodies may be polyclonal or monoclonal and may be
raised against native APC protein, APC fusion proteins, or
mutant APC proteins. The antibodies should be immunoreactive
with APC epitopes, preferably epitopes not present on
other human proteins. In a preferred embodiment of the
invention the antibodies will immunoprecipitate APC proteins
from solution as well as react with APC protein on
Western or immunoblots of polyacrylamide gels. In another
preferred embodiment, the antibodies will detect APC proteins
in paraffin or frozen tissue sections, using
immunocytochemical techniques. Techniques for raising and purifying
antibodies are well known in the art and any such techniques
may be chosen to achieve the preparation of the invention.

Predisposition to cancers as in FAP and GS can be
ascertained by testing any tissue of a human for mutations of
the APC gene. For example, a person who has inherited a
germline APC mutation would be prone to develop cancers.
This can be determined by testing DNA from any tissue of
the person's body. Most simply, blood can be drawn and
DNA extracted from the cells of the blood. In addition,
prenatal diagnosis can be accomplished by testing fetal cells,
placental cells, or amniotic fluid for mutations of the APC
gene. Alteration of a wild-type APC allele, whether for
example, by point mutation or by deletion, can be detected
by any of the means discussed above.

Molecules of cDNA according to the present invention are
intron-free, APC gene coding molecules. They can be made
by reverse transcriptase using the APC mRNA as a template.
These molecules can be propagated in vectors and cell lines
as is known in the art. Such molecules have the sequence
shown in SEQ ID NO:3. The cDNA can also be made using
the techniques of synthetic chemistry given the sequence
disclosed herein.

A short region of homology has been identified between
APC and the human m3 muscarinic acetylcholine receptor
(mAChR). This homology was largely confined to 29 residues
in which 6 out of 7 amino acids (EL(G or A)GLQA)
were identical (See FIG. 4 (SEQ ID NO: 9)). Initially, it was
not known whether this homology was significant, because
many other proteins had higher levels of global homology
(though few had six out of seven contiguous amino acids in
common). However, a study on the sequence elements
controlling G protein activation by mAChR subtypes
(Lechleiter et al., EMBO J., p. 4381 (1990)) has shown that
a 21 amino acid region from the m3 mAChR completely
mediated G protein specificity when substituted for the 21
amino acids of m2 mAChR at the analogous protein
position. These 21 residues overlap the 19 amino acid
homology between APC and m3 mAChR.

This connection between APC and the G protein activating
region of mAChR is intriguing in light of previous
investigations relating G proteins to cancer. For example, the
RAS oncogenes, which are often mutated in colorectal
cancers (Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525
(1988); Bos et al., Nature, Vol. 327, p. 293 (1987)), are
members of the G protein family (Bourne et al., Nature, Vol.
348, p. 125 (1990)) as is an in vitro transformation suppressor
(Noda et al., Proc. Natl. Acad. Sci. USA, Vol. 86, p. 162
(1989)) and genes mutated in hormone producing tumors
(Landis et al., Nature, Vol. 340, p. 692 (1989); Lyons et al.,
Science, Vol. 249, p. 655 (1990)). Additionally, the gene
responsible for neurofibromatosis (presumably a tumor
suppressor gene) has been shown to activate the GTPase
activity of RAS (Xu et al., Cell, Vol. 63, p. 835 (1990);
Martin et al., Cell, Vol. 63, p. 843 (1990); Ballester et al.,
Cell, Vol. 63, p. 851 (1990)). Another interesting link
between G proteins and colon cancer involves the drug
sulindac. This agent has been shown to inhibit the growth of
benign colon tumors in patients with FAP, presumably by
virtue of its activity as a cyclooxygenase inhibitor (Waddell
et al., J. Surg. Oncology 24(1), 83 (1983); Waddell et al., Am.
J. Surg., 157(1), 175 (1989); Charneau et al.,
Gastroenterologie Clinique et Biologique 14(2), 153 (1990)).
Cyclooxygenase is required to convert arachidonic acid to
prostaglandins and other biologically active molecules. G proteins
are known to regulate phospholipase A2 activity, which
generates arachidonic acid from phospholipids (Role et al.,
Proc. Natl. Acad. Sci. USA, Vol. 84, p. 3623 (1987); Kurachi
et al., Nature, Vol. 337, p. 555 (1989)). Therefore we
propose that wild-type APC protein functions by interacting
with a G protein and is involved in phospholipid
metabolism.

The following are provided for exemplification purposes
only and are not intended to limit the scope of the invention
which has been described in broad terms above.

EXAMPLE 1 

This example demonstrates the isolation of a 5.5 Mb
region of human DNA linked to the FAP locus. Six genes are
identified in this region, all of which are expressed in normal
colon cells and in colorectal, lung, and bladder tumors.

The cosmid markers YN5.64 and YN5.48 have previously
been shown to delimit an 8 cM region containing the locus
for FAP (Nakamura et al., Am. J. Hum. Genet., Vol. 43, p.
638 (1988)). Further linkage and pulse-field gel
electrophoresis (PFGE) analysis with additional markers has shown
that the FAP locus is contained within a 4 cM region
bordered by cosmids EF5.44 and L5.99. In order to isolate
clones representing a significant portion of this locus, a yeast
artificial chromosome (YAC) library was screened with
various 5q21 markers. Twenty-one YAC clones, distributed
within six contigs and including 5.5 Mb from the region
between YN5.64 and YN5.48, were obtained (FIG. 1A).

Three contigs encompassing approximately 4 Mb were
contained within the central portion of this region. The
YACs constituting these contigs, together with the markers
used for their isolation and orientations, are shown in FIG.
1. These YAC contigs were obtained in the following way.
To initiate each contig, the sequence of a genomic marker
cloned from chromosome 5q21 was determined and used to
design primers for PCR. PCR was then carried out on pools
of YAC clones distributed in microtiter trays as previously
described (Anand et al., Nucleic Acids Research, Vol. 18, p.
1951 (1990)). Individual YAC clones from the positive pools
were identified by further PCR or hybridization based
assays, and the YAC sizes were determined by PFGE.

To extend the areas covered by the original YAC clones,
"chromosomal walking" was performed. For this purpose,
YAC termini were isolated by a PCR based method and
sequenced (Riley et al., Nucleic Acids Research, Vol. 18, p.
2887 (1990)). PCR primers based on these sequences were
then used to rescreen the YAC library. For example, the
sequence from an intron of the FER gene (Hao et al., Mol.
Cell. Biol., Vol. 9, p. 1587 (1989)) was used to design PCR
primers for isolation of the 28EC1 and 5EH8 YACs. The
termini of the 28EC1 YAC were sequenced to derive markers
RHE28 and LHE28, respectively. The sequences of these
two markers were then used to isolate YAC clones 15CH12
(from RHE28) and 40CF1 and 29EF1 (from LHE28). These
five YACs formed a contig encompassing 1200 kb (contig
1, FIG. 1B).

Similarly, contig 2 was initiated using cosmid N5.66
sequences, and contig 3 was initiated using sequences both
from the MCC gene and from cosmid EF5.44. A walk in the
telomeric direction from YAC 14FH1 and a walk in the
opposite direction from YAC 39GG3 allowed connection of
the initial contig 3 clones through YAC 37HG4 (FIG. 1B).
YAC 37HG4 was deposited at the National Collection of
Industrial and Marine Bacteria (NCIMB), P.O. Box 31, 23
St. Machar Drive, Aberdeen AB2 1RY, Scotland, under
Accession No. 40353 on Dec. 17, 1990.

Multipoint linkage analysis with the various markers used
to define the contigs, combined with PFGE analysis, showed
that contigs 1 and 2 were centromeric to contig 3. These
contigs were used as tools to orient and/or identify genes
which might be responsible for FAP. Six genes were found
to lie within this cluster of YACs, as follows:

Contig 1: FER. The FER gene was discovered through
its homology to the viral oncogene ABL (Hao et al., supra).
It has an intrinsic tyrosine kinase activity, and in situ
hybridization with an FER probe showed that the gene was
located at 5q11-23 (Morris et al., Cytogenet. Cell. Genet.,
Vol. 53, p. 4, (1990)). Because of the potential role of this
oncogene-related gene in neoplasia, we decided to evaluate
it further with regards to the FAP locus. A human genomic
clone from FER was isolated (MF 23) and used to define a
restriction fragment length polymorphism (RFLP), and the
RFLP in turn used to map FER by linkage analysis using a
panel of three generation families. This showed that FER
was very tightly linked to previously defined polymorphic
markers for the FAP locus. The genetic mapping of FER was
complemented by physical mapping using the YAC clones
derived from FER sequences (FIG. 1B). Analysis of YAC
contig 1 showed that FER was within 600 kb of cosmid
marker M5.28, which maps to within 15 Mb of cosmid
L5.99 by PFGE of human genomic DNA. Thus, the YAC
mapping results were consistent with the FER linkage data
and PFGE analyses.

Contig 2: TB1. TB1 was identified through a
cross-hybridization approach. Exons of genes are often
evolutionarily conserved while introns and intergenic regions are
much less conserved. Thus, if a human probe
cross-hybridizes strongly to the DNA from non-primate species,
there is a reasonable chance that it contains exon sequences.
Subclones of the cosmids shown in FIG. 1 were used to
screen Southern blots containing rodent DNA samples. A
subclone of cosmid N5.66 (p 5.66-4) was shown to strongly
hybridize to rodent DNA, and this clone was used to screen
cDNA libraries derived from normal adult colon and fetal
liver. The ends of the initial cDNA clones obtained in this
screen were then used to extend the cDNA sequence.
Eventually, 11 cDNA clones were isolated, covering 2314
bp. The gene detected by these clones was named TB1.
Sequence analysis of the overlapping clones revealed an
open reading frame (ORF) that extended for 1302 bp starting
from the most 5' sequence data obtained (FIG. 2A). If this
entire open reading frame were translated, it would encode
434 amino acids (SEQ ID NO:3). The product of this gene
was not globally homologous to any other sequence in the
current database but showed two significant local similarities
to a family of ADP, ATP carrier/translocator proteins and
mitochondrial brown fat uncoupling proteins which are
widely distributed from yeast to mammals. These conserved
regions of TB1 (underlined in FIG. 2A) may define a
predictive motif for this sequence family. In addition, TB1
appeared to contain a signal peptide (or mitochondrial
targeting sequence) as well as at least 7 transmembrane
domains.

Contig 3: MCC, TB2, SRP and APC. The MCC gene
was also discovered through a cross-hybridization approach,
as described previously (Kinzler et al., Science, Vol. 251, p.
1366 (1991)). The MCC gene was considered a candidate
for causing FAP by virtue of its tight genetic linkage to FAP
susceptibility and its somatic mutation in sporadic colorectal
carcinomas. However, mapping experiments suggested that
the coding region of MCC was approximately 50 kb proximal
to the centromeric end of a 200 kb deletion found in an
FAP patient. MCC cDNA probes detected a 10 Kb mRNA
transcript on Northern blot analysis of which 4151 bp,
including the entire open reading frame, have been cloned.
Although the 3' non-translated portion or an alternatively
spliced form of MCC might have extended into this deletion,
it was possible that the deletion did not affect the MCC gene
product. We therefore used MCC sequences to initiate a
YAC contig, and subsequently used the YAC clones to
identify genes 50 to 250 kb distal to MCC that might be
contained within the deletion.

In a first approach, the insert from YAC 24ED6 (FIG. 1B)
was radiolabeled and hybridized to a cDNA library from
normal colon. One of the cDNA clones (YS39) identified in
this manner detected a 3.1 Kb mRNA transcript when used
as a probe for Northern blot hybridization. Sequence analysis
of the YS39 clone revealed that it encompassed 2283
nucleotides and contained an ORF that extended for 555 bp
from the most 5' sequence data obtained. If all of this ORF
were translated, it would encode 185 amino acids (SEQ ID
NO:6) (FIG. 2B). The gene detected by YS39 was named
TB2. Searches of nucleotide and protein databases revealed
that the TB2 gene was not identical to any previously
reported sequences nor were there any striking similarities.
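
The relationship between the ORF lengths and the predicted protein
lengths stated above follows from three nucleotides per codon. The
short check below is a sketch only; the function name
`codons_in_orf` is hypothetical, and it assumes that the stated ORF
lengths contain whole codons and no stop codon, which is consistent
with the figures given.

```python
def codons_in_orf(orf_bp: int) -> int:
    """Return the number of codons (amino acids) in an ORF of orf_bp bases.

    Assumes the ORF length counts whole codons only and excludes any
    stop codon, so the length must be a positive multiple of three.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3


# ORF lengths as stated in the text for TB1 and TB2.
for gene, orf_bp in (("TB1", 1302), ("TB2", 555)):
    print(gene, orf_bp, "bp ->", codons_in_orf(orf_bp), "amino acids")
# TB1 1302 bp -> 434 amino acids
# TB2 555 bp -> 185 amino acids
```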

Another clone (YS11) identified through the YAC 24ED6
screen appeared to contain portions of two distinct genes.
Sequences from one end of YS11 were identical to at least
180 bp of the signal recognition particle protein SRP19
(Lingelbach et al., Nucleic Acids Research, Vol. 16, p. 9431
(1988)). A second ORF, from the opposite end of clone YS11,
proved to be identical to 78 bp of a novel gene which was
independently identified through a second YAC-based
approach. For the latter, DNA from yeast cells containing
YAC 14FH1 (FIG. 1B) was digested with EcoRI and subcloned
into a plasmid vector. Plasmids that contained human
DNA fragments were selected by colony hybridization using
total human DNA as a probe. These clones were then used
to search for cross-hybridizing sequences as described above
for TB1, and the cross-hybridizing clones were subsequently
used to screen cDNA libraries. One of the cDNA clones
discovered in this way (FH38) contained a long ORF (2496
bp), 78 bp of which were identical to the above-noted
sequences in YS11. The ends of the FH38 cDNA clone were
then used to initiate cDNA walking to extend the sequence.
Eventually, 85 cDNA clones were isolated from normal
colon, brain and liver cDNA libraries and found to encompass
8973 nucleotides of contiguous transcript. The gene
corresponding to this transcript was named APC. When used
as probes for Northern blot analysis, APC cDNA clones
hybridized to a single transcript of approximately 9.5 kb,
suggesting that the great majority of the gene product was
represented in the cDNA clones obtained. Sequences from
the 5' end of the APC gene were found in YAC 37HG4 but
not in YAC 14FH1. However, the 3' end of the APC gene
was found in 14FH1 as well as 37HG4. Analogously, the 5'
end of the MCC coding region was found in YAC clones
19AA9 and 266C3 but not 24ED6 or 14FH1, while the 3'
end displayed the opposite pattern. Thus, MCC and APC
transcription units pointed in opposite directions, with the
direction of transcription going from centromeric to telomeric
in the case of MCC, and telomeric to centromeric in the
case of APC. PFGE analysis of YAC DNA digested with
various restriction endonucleases showed that TB2 and SRP
were between MCC and APC, and that the 3' ends of the
coding regions of MCC and APC were separated by
approximately 150 kb (FIG. 1B).

Sequence analysis of the APC cDNA clones revealed an
open reading frame of 8,535 nucleotides. The 5' end of the
ORF contained a methionine codon (codon 1) that was
preceded by an in-frame stop codon 9 bp upstream, and the
3' end was followed by several in-frame stop codons. The
protein produced by initiation at codon 1 would contain
2,842 amino acids (SEQ ID NO:7) (FIG. 3). The results of
database searching with the APC gene product were quite
complex due to the presence of large segments with locally
biased amino acid compositions. In spite of this, APC could
be roughly divided into two domains. The N-terminal 25%
of the protein had a high content of leucine residues (12%)
and showed local sequence similarities to myosins, various
intermediate filament proteins (e.g., desmin, vimentin,
neurofilaments) and Drosophila armadillo/human plakoglobin.
The latter protein is a component of adhesive junctions
(desmosomes) joining epithelial cells (Franke et al., Proc.
Natl. Acad. Sci. U.S.A., Vol. 86, p. 4027 (1989); Peifer et al.,
Cell, Vol. 63, p. 1167 (1990)). The C-terminal 75% of APC
(residues 731-2832) is 17% serine by composition with
serine residues more or less uniformly distributed. This large
domain also contains local concentrations of charged
(mostly acidic) and proline residues. There was no indication
of potential signal peptides, transmembrane regions, or
nuclear targeting signals in APC, suggesting a cytoplasmic
localization.

To detect short similarities to APC, a database search was
performed using the PAM-40 matrix (Altschul, J. Mol. Biol.,
Vol. 219, p. 555 (1991)). Potentially interesting matches to
several proteins were found. The most suggestive of these
involved the ral2 gene product of yeast, which is implicated
in the regulation of ras activity (Fukui et al., Mol. Cell. Biol.,
Vol. 9, p. 5617 (1989)). Little is known about how ral2 might
interact with ras, but it is interesting to note the positively-
charged character of this region in the context of the
negatively-charged GAP interaction region of ras. A specific
electrostatic interaction between ras and GAP-related
proteins has been proposed.

Because of the proximity of the MCC and APC genes, and
the fact that both are implicated in colorectal tumorigenesis,
we searched for similarities between the two predicted
proteins. Bourne has previously noted that MCC has the
potential to form alpha helical coiled coils (Nature, Vol. 351,
p. 188 (1991)). Lupas and colleagues have recently developed
a program for predicting coiled coil potential from
primary sequence data (Science, Vol. 252, p. 1162 (1991))
and we have used their program to analyze both MCC and
APC. Analysis of MCC indicated a discontinuous pattern of
coiled-coil domains separated by putative "hinge" or
"spacer" regions similar to those seen in laminin and other
intermediate filament proteins. Analysis of the APC
sequence revealed two regions in the N-terminal domain
which had strong coiled coil-forming potential, and these
regions corresponded to those that showed local similarities
with myosin and IF proteins on database searching. In
addition, one other putative coiled coil region was identified
in the central region of APC. The potential for both APC and
MCC to form coiled coils is interesting in that such structures
often mediate homo- and hetero-oligomerization.

Finally, it had previously been noted that MCC shared a
short similarity with the region of the m3 muscarinic
acetylcholine receptor (mAChR) known to regulate specificity
of G-protein coupling. The APC gene also contained a local
similarity to the region of the m3 mAChR (SEQ ID NO:9)
that overlapped with the MCC similarity (SEQ ID NO: [illegible])
(FIG. 4B). Although the similarities to ral2 (SEQ ID NO:8)
(FIG. 4A) and m3 mAChR (SEQ ID NO:9) (FIG. 4B) were
not statistically significant, they were intriguing in light of
previous observations relating G-proteins to neoplasia.

Each of the six genes described above was expressed in
normal colon mucosa, as indicated by their representation in
colon cDNA libraries. To study expression of the genes in
neoplastic colorectal epithelium, we employed reverse
transcription-polymerase chain reaction (PCR) assays.
The sequences of FER, TB1, TB2, MCC, and APC were
each used to design primers for PCR performed with cDNA
templates. Each of these genes was found to be expressed
in normal colon, in each of ten cell lines derived from
colorectal cancers, and in tumor cell lines derived from
lung and bladder tumors. The ten colorectal cancer cell
lines included eight from patients with sporadic CRC and
two from patients with FAP.

EXAMPLE 2 

This example demonstrates a genetic analysis of the role 
of the FER gene in FAP and sporadic colorectal cancers. 

We considered FER as a candidate because of its proximity
to the FAP locus as judged by physical and genetic
criteria (see Example 1), and its homology to known
tyrosine kinases with oncogenic potential. Primers were
designed to PCR-amplify the complete coding sequence of
FER from the RNA of two colorectal cancer cell lines
derived from FAP patients. cDNA was generated from RNA
and used as a template for PCR. The primers used were
5'-AGAAGGATCCCTTGTGCACTGTGGA-3' (SEQ ID
NO:95) and 5'-GAC[illegible]GGATCCTGAAGCTGAGTTTG-3'
(SEQ ID NO:96). The underlined nucleotides were altered
from the true FER sequence to create BamHI sites. The cell
lines used were JW and Difi, both derived from colorectal
cancers of FAP patients (C. Paraskeva, B. G. Buckle, D.
Sheer, C. B. Wigley, Int. J. Cancer 34, 49 (1984); M. E.
Gross et al., Cancer Res. 51, 1452 (1991)). The resultant 2554
basepair fragments were cloned and sequenced in their
entirety. The PCR products were cloned in the BamHI site
of Bluescript SK (Stratagene) and pools of at least 50 clones
were sequenced en masse using T7 polymerase, as described
in Nigro et al., Nature 342, 705 (1989).

Only a single conservative amino acid change
(GTG->CTG, creating a val to leu substitution at codon 439)
was observed. The region surrounding this codon was then
amplified from the DNA of individuals without FAP and this
substitution was found to be a common polymorphism, not
specifically associated with FAP. Based on these results, we
considered it unlikely (though still possible) that the FER gene
was responsible for FAP. To amplify the regions surrounding
codon 439, the following primers were used:
5'-TCAGAAAGTGCTGAAGAG-3' (SEQ ID NO:97) and
5'-GGAATAATTAGGTCTCCAA-3' (SEQ ID NO:98). PCR
products were digested with PstI, which yields a 50 bp
fragment if codon 439 is leucine, but 26 and 24 bp fragments
if it is valine. The primers used for sequencing were chosen
from the FER cDNA sequence in Hao et al., supra.
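
The PstI genotyping step above amounts to asking whether a 50 bp amplicon contains the enzyme's recognition site and, if so, where the cut falls. The following Python sketch illustrates the fragment-size logic only. The two 50 bp sequences are synthetic placeholders invented for this illustration (they are not FER sequence), and the sketch makes no claim about which codon 439 allele carries the site; the only external fact relied on is that PstI recognizes CTGCAG and cuts after the A on the top strand.

```python
PSTI_SITE = "CTGCAG"
PSTI_CUT_OFFSET = 5  # PstI cuts CTGCA^G on the top strand


def fragment_lengths(seq, site=PSTI_SITE, cut_offset=PSTI_CUT_OFFSET):
    """Return top-strand fragment lengths after complete digestion of a
    linear sequence at every occurrence of the recognition site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Synthetic 50 bp amplicons (hypothetical, for illustration only)
flank5 = "AT" * 10 + "A"   # 21 bases upstream of the site
flank3 = "TA" * 11 + "T"   # 23 bases downstream of the site
amplicon_with_site = flank5 + "CTGCAG" + flank3
amplicon_without_site = flank5 + "CTGCAC" + flank3  # one base change removes the site

print(fragment_lengths(amplicon_with_site))
print(fragment_lengths(amplicon_without_site))
```

With these placeholder sequences, the amplicon carrying the site yields 26 and 24 base fragments and the amplicon lacking it remains 50 bases, matching the fragment sizes quoted in the text.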

EXAMPLE 3 

This example demonstrates the genetic analysis of MCC,
TB2, SRP and APC in FAP and sporadic colorectal tumors. 
Each of these genes is linked and encompassed by contig 3 
(see FIG. 1). 

Several lines of evidence suggested that this contig was of
particular interest. First, at least three of the four genes in
this contig were within the deleted region identified in two
FAP patients. (See Example 5 infra.) Second, allelic deletions
of chromosome 5q21 in sporadic cancers appeared to
be centered in this region. (Ashton-Rickardt et al.,
Oncogene, in press; and Miki et al., Japn. J. Cancer Res., in
press.) Some tumors exhibited loss of proximal RFLP markers
(up to and potentially including the 5' end of MCC), but
no loss of markers distal to MCC. Other tumors exhibited
loss of markers distal to and perhaps including the 3' end of
MCC, but no loss of sequences proximal to MCC. This
suggested either that different ends of MCC were affected by
loss in all such cases, or alternatively, that two genes (one
proximal to and perhaps including MCC, the other distal to
MCC) were separate targets of deletion. Third, clones from
each of the six FAP region genes were used as probes on
Southern blots containing tumor DNA from patients with
sporadic CRC. Only two examples of somatic changes were
observed in over 200 tumors studied: a rearrangement/
deletion whose centromeric end was located within the
MCC gene (Kinzler et al., supra) and an 800 bp insertion
within the APC gene between nucleotides 4424 and 5584.
Fourth, point mutations of MCC were observed in two
tumors (Kinzler et al., supra), strongly suggesting that MCC
was a target of mutation in at least some sporadic colorectal
cancers.

Based on these results, we attempted to search for subtle
alterations of contig 3 genes in patients with FAP. We chose
to examine MCC and APC, rather than TB2 or SRP, because
of the somatic mutations in MCC and APC noted above. To
facilitate the identification of subtle alterations, the genomic
sequences of MCC and APC exons were determined (see
Table I, SEQ ID NO:24-38).

TABLE I
APC EXONS

EXON NUCLEOTIDES 1     EXON BOUNDARY SEQUENCE 2

822 to 930 catgat^tafctgtatttacctata^taaattatra . . . (SEQ ID NO: 24) 

. . . ACCAAGygtaacagaag att acaaacoctg^actaa tgccatg^tactttgctaag (SEQ ID NO: 25) 
931 to 1309 ggatattaaa^^aatm)mtctaaactcattt^<xcaca^GTGGAA . . . (SEQ ID NO: 26) 

. . . AITCAA/^gttctctatagtgtacatcgtagtgcatg (SEQ ID NO: 27) 
1310 to 1405 cateattgctettcaaataacaa ag<*t^^ . . . (SEQ ID NO: 28) 

. . . AACTAG/gtaagacaaaaatgttttttaatgacatagacaattactggtg (SEQ ID NO: 29) 
1406 to 1545 tagatgatt^tffltcctctt^^tttttaaattag/GGGGAC . . . (SEQ ID NO: 30) 

. . . AACAAGtotetttttataa^rfrtatt arta^ (SEQ ID NO: 31) 

1546 to 1623 gcttggcttcaagttgtotttttaatgato . . . (SEQ ID NO: 32) 

. . . CAGCAG/gtactatttagaamcacctimmctttt^ (SEQ ID NO: 33)

1624 to 1740 gcaactagtatgattttatgtata3attaafrtt . . . (SEQ ID NO: 34) 

. . . AAAAAG/gtacctttgaaaacatttagtactataatatgaatttcatgt (SEQ ID NO: 35) 
1741 to 1955 caactcmttagatgacccatattcagaaacttactag/GAAICA . . . (SEQ ID NO: 36)

. . . C^CAGfaatatatajggttta^ (SEQ ID NO: 37) 

1956 to 8973 tcttgatttttatttcag/GCAAAT . . . (SEQ ID NO: 38) 

, . . GGTXrnAIWAAAAAAAAAIXnTTTICT (SEQ ID NO: 1) 

1 Relative to predicted translation initiation site

2 Small case letters represent introns, large case letters represent exons

The entire 3' end of the cloned APC cDNA (nt 1956-8973) appeared to be encoded in this exon, as indicated by
restriction endonuclease mapping and sequencing of the cloned genomic DNA. The ORF ended at nt 8535. The
extreme 3' end of the APC transcript has not yet been identified.



These sequences were used to design primers for PCR 
analysis of constitutional DNA from FAP patients. 

We first amplified eight exons and surrounding introns of
the MCC gene in affected individuals from 90 different FAP
kindreds. The PCR products were analyzed by a ribonuclease
(RNase) protection assay. In brief, the PCR products
were hybridized to in vitro transcribed RNA probes
representing the normal genomic sequences. The hybrids were
digested with RNase A, which can cleave at single base pair
mismatches within DNA-RNA hybrids, and the cleavage
products were visualized following denaturing gel
electrophoresis. Two separate RNase protection analyses were
performed for each exon, one with the sense and one with
the antisense strand. Under these conditions, approximately
40% of all mismatches are detectable. Although some amino
acid variants of MCC were observed in FAP patients, all
such variants were found in a small percentage of normal
individuals. These variants were thus unlikely to be
responsible for the inheritance of FAP.

We next examined three exons of the APC gene. The
three exons examined included those containing nt 822-930,
931-1309, and the first 300 nt of the most distal exon (nt
1956-2256). PCR and RNase protection analysis were
performed as described in Kinzler et al., supra, using the primers
underlined in Table I (SEQ ID NO:24-38). The primers for
nt 1956-2256 were 5'-GCAAATCCTAAGAGAGAACAA-3'
(SEQ ID NO:99) and 5'-GATGGCAAGCTTGAGCCAG-3'
(SEQ ID NO:100).

In 90 kindreds, the RNase protection method was used to
screen for mutations and in an additional 13 kindreds, the
PCR products were cloned and sequenced to search for
mutations not detectable by RNase protection. PCR products
were cloned into a Bluescript vector modified as described
in T. A. Holton and M. W. Graham, Nucleic Acids Res. 19,
1156 (1991). A minimum of 100 clones were pooled and
sequenced. Five variants were detected among the 103
kindreds analyzed. Cloning and subsequent DNA sequencing
of the PCR product of patient P21 indicated a C to T
transition in codon 413 that resulted in a change from
arginine to cysteine. This amino acid variant was not
observed in any of 200 DNA samples from individuals
without FAP. Cloning and sequencing of the PCR product
from patients P24 and P34, who demonstrated the same
abnormal RNase protection pattern, indicated that both had a
C to T transition at codon 301 that resulted in a change from
arginine (CGA) to a stop codon (TGA). This change was not
present in 200 individuals without FAP. As this point mutation
resulted in the predicted loss of the recognition site for
the enzyme Taq I, appropriate PCR products could be
digested with Taq I to detect the mutation. This allowed us
to determine that the stop codon co-segregated with disease
phenotype in members of the family of P24. The inheritance
of this change in affected members of the pedigree provides
additional evidence for the importance of the mutation.
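
The Taq I assay described above relies on the mutation destroying a restriction site that is present in the wild-type sequence. A minimal Python sketch of that check follows. The 9-base sequences are hypothetical stand-ins chosen only so that the arginine codon CGA sits immediately after a T (they are not APC sequence), and the helper name is an assumption of this sketch; the only external fact used is that Taq I recognizes TCGA.

```python
# Recognition sites to test; TaqI recognizes TCGA.
SITES = {"TaqI": "TCGA"}


def lost_sites(wild_type, mutant, sites=SITES):
    """Return names of enzymes whose site is present in the wild-type
    sequence but absent from the mutant sequence."""
    return [
        name
        for name, site in sites.items()
        if site in wild_type and site not in mutant
    ]


# Hypothetical context: ...GCT CGA AAG... -> ...GCT TGA AAG...
wild_type = "GCTCGAAAG"   # contains T + CGA = TCGA
mutant = "GCTTGAAAG"      # CGA -> TGA (Arg -> Stop) removes TCGA

print(lost_sites(wild_type, mutant))
```

In this toy context the C to T change removes the only TCGA, so a Taq I digest of the mutant product would leave it uncut while the wild-type product is cleaved.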

Cloning and sequencing of the PCR product from FAP
patient P93 indicated a C to G transversion at codon 279,
also resulting in a stop codon (change from TCA to TGA).
This mutation was not present in 200 individuals without
FAP. Finally, one additional mutation resulting in a serine
(TCA) to stop codon (TGA) at codon 712 was detected in a
single patient with FAP (patient P60).
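
The codon changes reported above can be checked mechanically against the standard genetic code, along with whether each base change is a transition or a transversion. The sketch below is illustrative only; the function and variable names are not from the disclosure, and it assumes single-base substitutions within one codon.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
PURINES = {"A", "G"}


def describe_change(wild_codon, mutant_codon):
    """Return (wild amino acid, mutant amino acid, substitution type)
    for two codons differing at exactly one position ('*' = stop)."""
    diffs = [(w, m) for w, m in zip(wild_codon, mutant_codon) if w != m]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    w, m = diffs[0]
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    kind = "transition" if (w in PURINES) == (m in PURINES) else "transversion"
    return CODON_TABLE[wild_codon], CODON_TABLE[mutant_codon], kind


# Codon changes quoted in the text (patient: wild-type -> mutant)
reported = {
    "P21": ("CGC", "TGC"),
    "P24/P34": ("CGA", "TGA"),
    "P93": ("TCA", "TGA"),
    "P60": ("TCA", "TGA"),
}
for patient, (wild, mut) in reported.items():
    print(patient, wild, "->", mut, describe_change(wild, mut))
```

Run against the reported codons, this agrees with the text: CGC to TGC is an Arg to Cys transition, CGA to TGA is an Arg to Stop transition, and TCA to TGA is a Ser to Stop transversion.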

The five germline mutations identified are summarized in
Table IIA, as well as four others discussed in Example 9.

TABLE IIA

Germline mutations of the APC gene in FAP and GS Patients

                  NUCLEOTIDE       AMINO ACID              EXTRA-COLONIC
PATIENT  CODON    CHANGE           CHANGE            AGE   DISEASE

93       279      TCA->TGA         Ser->Stop         39    Mandibular Osteoma
24       301      CGA->TGA         Arg->Stop         46    None
34       301      CGA->TGA         Arg->Stop         27    Desmoid Tumor
21       413      CGC->TGC         Arg->Cys          24    Mandibular
60       712      TCA->TGA         Ser->Stop         37    Mandibular Osteoma
3746     243      CAGAG->CAG       splice-junction
3460     301      CGA->TGA         Arg->Stop
3827     456      CTTTCA->CTTCA    frameshift
3712     500      T->G             Tyr->Stop

* The mutated nucleotides are underlined.

In addition to these germline mutations, we identified several
somatic mutations of MCC and APC in sporadic CRCs.
Seventeen MCC exons were examined in 90 sporadic colorectal
cancers by RNase protection analysis. In each case
where an abnormal RNase protection pattern was observed,
the corresponding PCR products were cloned and
sequenced. This led to the identification of six point mutations
(two described previously) (Kinzler et al., supra), each
of which was not found in the germline of these patients
(Table IIB).

TABLE IIB

Somatic Mutations in Sporadic CRC Patients

                      NUCLEOTIDE                  AMINO ACID
PATIENT  CODON 1      CHANGE                      CHANGE

T35      MCC 12       GAG/gtaaga->GAG/gtaaaa      (Splice Donor)
T16      MCC 145      cteag/GGA->atcag/GGA        (Splice Acceptor)
T47      MCC 267      CGG->CTG                    Arg->Leu
T81      MCC 490      TCG->TTG                    Ser->Leu
T35      MCC 506      CGG->CAG                    Arg->Gln
T91      MCC 698      GCT->GTT                    Ala->Val
T34      APC 288      CCAGT->CCCAGCCAGT           (Insertion)
T27      APC 331      CGA->TGA                    Arg->Stop
T135     APC 437      CAA/gtaa->CAA/gcaa          (Splice Donor)
T201     APC 1338     CAG->TAG                    Gln->Stop

1 For splice site mutations, the codon nearest to the mutation is listed.

The underlined nucleotides were mutant; small case letters represent introns,
large case letters represent exons.

Four of the mutations resulted in amino acid substitutions
and two resulted in the alteration of splice site consensus
elements. Mutations at analogous splice site positions in
other genes have been shown to alter RNA processing in
vivo and in vitro.

Three exons of APC were also evaluated in sporadic
tumors. Sixty tumors were screened by RNase protection,
and an additional 98 tumors were evaluated by sequencing.
The exons examined included nt 822-930, 931-1309, and
1406-1545 (Table I). A total of three mutations were
identified, each of which proved to be somatic. Tumor T27
contained a somatic mutation of CGA (arginine) to TGA
(stop codon) at codon 331. Tumor T135 contained a GT to GC
change at a splice donor site. Tumor T34 contained a 5 bp
insertion (CAGCC between codons 288 and 289) resulting
in a stop at codon 291 due to a frameshift.

We serendipitously discovered one additional somatic
mutation in a colorectal cancer. During our attempt to define
the sequences and splice patterns of the MCC and APC gene
products in colorectal epithelial cells, we cloned cDNA from
the colorectal cancer cell line SW480. The amino acid
sequence of the MCC gene from SW480 was identical to
that previously found in clones from human brain. The
sequence of APC in SW480 cells, however, differed
significantly, in that a transition at codon 1338 resulted in a
change from glutamine (CAG) to a stop codon (TAG). To
determine if this mutation was somatic, we recovered DNA
from archival paraffin blocks of the original surgical specimen
(T201) from which the tumor cell line was derived 28
years ago.

DNA was purified from paraffin sections as described in
S. E. Goelz, S. R. Hamilton, and B. Vogelstein, Biochem.
Biophys. Res. Comm. 130, 118 (1985). PCR was performed
as described in reference 24, using the primers
5'-GT[illegible]CCAGCAGTGTCACAG-3' (SEQ ID NO:101) and
5'-GGGAG[illegible]TTTCGCTCCTGA-3' (SEQ ID NO:102). A
PCR product containing codon 1338 was amplified from the
archival DNA and used to show that the stop codon
represented a somatic mutation present in the original primary
tumor and in cell lines derived from the primary and
metastatic tumor sites, but not from normal tissue of the
patient.

The ten point mutations in the MCC and APC genes so far
discovered in sporadic CRCs are summarized in Table IIB.
Analysis of the number of mutant and wild-type PCR clones
obtained from each of these tumors showed that in eight of
the ten cases, the wild-type sequence was present in
approximately equal proportions to the mutant. This was confirmed
by RFLP analysis using flanking markers from chromosome
5q which demonstrated that only two of the ten tumors
(T135 and T201) exhibited an allelic deletion on chromosome
5q. These results are consistent with previous observations
showing that 20-40% of sporadic colorectal tumors
had allelic deletions of chromosome 5q. Moreover, these
data suggest that mutations of 5q21 genes are not limited to
those colorectal tumors which contain allelic deletions of
this chromosome.

EXAMPLE 4 

This example characterizes small, nested deletions in 
DNA from two unrelated FAP patients. 

DNA from 40 FAP patients was screened with cosmids
that had been mapped into a region near the APC locus to
identify small deletions or rearrangements. Two of these
cosmids, L5.71 and L5.79, hybridized with a 1200 kb NotI
fragment in DNAs from most of the FAP patients screened.

The DNA of one FAP patient, 3214, showed only a 940 kb
NotI fragment instead of the expected 1200 kb fragment.
DNA was analyzed from four other members of the patient's
immediate family; the 940 kb fragment was present in her
affected mother (4711), but not in the other, unaffected
family members. The mother also carried a normal 1200 kb
NotI fragment that was transmitted to her two unaffected
offspring. These observations indicated that the mutant
polyposis allele is on the same chromosome as the 940 kb
NotI fragment. A simple interpretation is that APC patients
3214 and 4711 each carry a 260 kb deletion within the APC
locus.

If a deletion were present, then other enzymes might also
be expected to produce fragments with altered mobilities.
Hybridization of L5.79 to NruI-digested DNAs from both
affected members of the family revealed a novel NruI
fragment of 1300 kb, in addition to the normal 1200 kb NruI
fragment. Furthermore, MluI fragments in patients 3214 and
4711 also showed an increase in size consistent with the
deletion of an MluI site. The two chromosome 5 homologs
of patient 3214 were segregated in somatic cell hybrid lines;
HHW1155 (deletion hybrid) carried the abnormal homolog
and HHW1159 (normal hybrid) carried the normal homolog.

Because patient 3214 showed only a 940 kb NotI
fragment, she had not inherited the 1200 kb fragment present
in the unaffected father's DNA. This observation suggests
that he must be heterozygous for, and have transmitted,
either a deletion of the L5.79 probe region or a variant NotI
fragment too large to resolve on the gel system. As expected,
the hybrid cell line HHW1159, which carries the paternal
homolog, revealed no resolved NotI fragment when probed
with L5.79. However, probing of HHW1159 DNA with
L5.79 following digestion with other enzymes did reveal
restriction fragments, demonstrating the presence of DNA
homologous to the probe. The father is, therefore,
interpreted as heterozygous for a polymorphism at the NotI site,
with one chromosome 5 having a 1200 kb NotI fragment and
the other having a fragment too large to resolve consistently
on the gel. The latter was transmitted to patient 3214.

When double digests were used to order restriction sites
within the 1200 kb NotI fragment, L5.71 and L5.79 were
both found to lie on a 550 kb NotI-NruI fragment and,
therefore, on the same side of an NruI site in the 1200 kb
NotI fragment. To obtain genomic representation of
sequences present over the entire 1200 kb NotI fragment, we
constructed a library of small-fragment inserts enriched for
sequences from this fragment. DNA from the somatic cell
hybrid HHW141, which contains about 40% of chromosome
5, was digested with NotI and electrophoresed under pulsed-
field gel (PFG) conditions; EcoRI fragments from the 1200
kb region of this gel were cloned into a phage vector. Probe
Map30 was isolated from this library. In normal individuals
probe Map30 hybridizes to the 1200 kb NotI fragment and
to a 200 kb NruI fragment. This latter hybridization places
Map30 distal, with respect to the locations of L5.71 and
L5.79, to the NruI site of the 550 kb NotI-NruI fragment.

Because Map30 hybridized to the abnormal, 1300 kb NruI
fragment of patient 3214, the locus defined by Map30 lies
outside the hypothesized deletion. Furthermore, in normal
chromosomes Map30 identified a 200 kb NruI fragment and
L5.79 identified a 1200 kb NruI fragment; the hypothesized
deletion must, therefore, be removing an NruI site, or sites,
lying between Map30 and L5.79, and these two probes must
flank the hypothesized deletion. A restriction map of the
genomic region, showing placement of these probes, is
shown in FIG. 5.

A NotI digest of DNA from another FAP patient, 3824,
was probed with L5.79. In addition to the 1200 kb normal
NotI fragment, a fragment of approximately 1100 kb was
observed, consistent with the presence of a 100 kb deletion
in one chromosome 5. In this case, however, digestion with
NruI and MluI did not reveal abnormal bands, indicating that
if a deletion were present, its boundaries must lie distal to the
NruI and MluI sites of the fragments identified by L5.79.
Consistent with this expectation, hybridization of Map30 to
DNA from patient 3824 identified a 760 kb MluI fragment
in addition to the expected 860 kb fragment, supporting the
interpretation of a 100 kb deletion in this patient. The two
chromosome 5 homologs of patient 3824 were segregated in
somatic cell hybrid lines; HHW1291 was found to carry
only the abnormal homolog and HHW1290 only the normal
homolog.
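
The deletion sizes inferred in this example follow directly from the difference between the normal and the shortened restriction fragment for each probe and enzyme. A minimal Python sketch of that bookkeeping is given below, using only fragment sizes quoted in the text; the data layout and function name are assumptions made for illustration. Note that this subtraction applies only when the deletion lies wholly within the fragment. Where a deletion removes a restriction site (as with the 1300 kb NruI fragment in patient 3214), the variant fragment becomes larger and the simple difference does not apply.

```python
# (patient, enzyme, normal fragment in kb, variant fragment in kb),
# taken from the fragment sizes quoted in the text.
observations = [
    ("3214", "NotI", 1200, 940),
    ("3824", "NotI", 1200, 1100),
    ("3824", "MluI", 860, 760),
]


def implied_deletion_kb(normal_kb, variant_kb):
    """Size of an internal deletion implied by a shortened fragment."""
    if variant_kb >= normal_kb:
        raise ValueError("variant fragment is not shorter than normal")
    return normal_kb - variant_kb


deletions = {
    (patient, enzyme): implied_deletion_kb(normal, variant)
    for patient, enzyme, normal, variant in observations
}
for key, size in deletions.items():
    print(key, size, "kb")
```

The two independent digests of patient 3824 DNA give the same 100 kb estimate, which is the consistency the text relies on.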

That the 860 kb MluI fragment identified by Map30 is
distinct from the 830 kb MluI fragment identified previously
by L5.79 was demonstrated by hybridization of Map30 and
L5.79 to a NotI-MluI double digest of DNA from the hybrid
cell (HHW1159) containing the nondeleted chromosome 5
homolog of patient 3214. As previously indicated, this
hybrid is interpreted as missing one of the NotI sites that
define the 1200 kb fragment. A 620 kb NotI-MluI fragment
was seen with probe L5.79, and an 860 kb fragment was
seen with Map30. Therefore, the 830 kb MluI fragment
recognized by probe L5.79 must contain a NotI site in
HHW1159 DNA; because the 860 kb MluI fragment
remains intact, it does not carry this NotI site and must be
distinct from the 830 kb MluI fragment.

EXAMPLE 5 

This example demonstrates the isolation of human
sequences which span the region deleted in the two unre- 
lated FAP patients characterized in Example 4. 

A strong prediction of the hypothesis that patients 3214
and 3824 carry deletions is that some sequences present on
normal chromosome 5 homologs would be missing from the
hypothesized deletion homologs. Therefore, to develop
genomic probes that might confirm the deletions, as well as
to identify genes from the region, YAC clones from a contig
seeded by cosmid L5.79 were localized from a library
containing seven haploid human genome equivalents
(Albertsen et al., Proc. Natl. Acad. Sci. U.S.A., Vol. 87, pp.
4256-4260 (1990)) with respect to the hypothesized deletions.
Three clones, YACs 57B8, 310D8, and 183H12, were
found to overlap the deleted region.

Importantly, one end of YAC 57B8 (clone AT57) was
found to lie within the patient 3214 deletion. Inverse
polymerase chain reaction (PCR) defined the end sequences of
the insert of YAC 57B8. PCR primers based on one of these
end sequences repeatedly failed to amplify DNA from the
somatic cell hybrid (HHW1155) carrying the deleted
homolog of patient 3214, but did amplify a product of the
expected size from the somatic cell hybrid (HHW1159)
carrying the normal chromosome 5 homolog. This result
supported the interpretation that the abnormal restriction
fragments found in the DNA of patient 3214 result from a
deletion.

Additional support for the hypothesis of deletion in DNA
from patient 3214 came from subcloned fragments of YAC
183H12, which spans the region in question. Y11, an EcoRI
fragment cloned from YAC 183H12, hybridized to the
normal, 1200 kb NotI fragment of patient 4711, but failed to
hybridize to the abnormal, 940 kb NotI fragment of 4711 or
to DNA from deletion cell line HHW1155. This result
confirmed the deletion in patient 3214.

Two additional EcoRI fragments from YAC 183H12, Y10
and Y14, were localized within the patient 3214 deletion by
their failure to hybridize to DNA from HHW1155. Probe
Y10 hybridizes to a 150 kb NruI fragment in normal
chromosome 5 homologs. Because the 3214 deletion creates
the 1300 kb NruI fragment seen with the probes L5.79 and
Map30 that flank the deletion, these NruI sites and the 150
kb NruI fragment lying between must be deleted in patient
3214. Furthermore, probe Y10 hybridizes to the same 620 kb
NotI-MluI fragment seen with probe L5.79 in normal DNA,
indicating its location as L5.79-proximal to the deleted MluI
site and placing it between the MluI site and the L5.79-
proximal NruI site. The MluI site must, therefore, lie
between the NruI sites that define the 150 kb NruI fragment
(see FIG. 5).

Probe Y11 also hybridized to the 150 kb NruI fragment in
the normal chromosome 5 homolog, but failed to hybridize
to the 620 kb NotI-MluI fragment, placing it L5.79-distal to
the MluI site, but proximal to the second NruI site.
Hybridization to the same (860 kb) MluI fragment as Map30
confirmed the localization of probe Y11 L5.79-distal to the
MluI site.

Probe Y14 was shown to be L5.79-distal to both deleted
NruI sites by virtue of its hybridization to the same 200 kb
NruI fragment of the normal chromosome 5 seen with
Map30. Therefore, the order of these EcoRI fragments
derived from YAC 183H12 and deleted in patient 3214, with
respect to L5.79 and Map30, is L5.79-Y10-Y11-Y14-
Map30.

The 100 kb deletion of patient 3824 was confirmed by the
failure of aberrant restriction fragments in this DNA to
hybridize with probe Y11, combined with positive
hybridizations to probes Y10 and/or Y14. Y10 and Y14 each
hybridized to the 1100 kb NotI fragment of patient 3824 as
well as to the normal 1200 kb NotI fragment, but Y11
hybridized to the 1200 kb fragment only. In the MluI digest,
probe Y14 hybridized to the 860 kb and 760 kb fragments
of patient 3824 DNA, but probe Y11 hybridized only to the
860 kb fragment. We conclude that the basis for the
alteration in fragment size in DNA from patient 3824 is,
indeed, a deletion. Furthermore, because probes Y10 and
Y14 are missing from the deleted 3214 chromosome, but
present on the deleted 3824 chromosome, and they have
been shown to flank probe Y11, the deletion in patient 3824
must be nested within the patient 3214 deletion.
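
The nesting argument above can be restated as simple set logic and fragment-size arithmetic. The following is only an illustrative sketch, not part of the original analysis: the dictionary layout and the helper names `deleted` and `deletion_size_kb` are assumptions introduced here, while the probe results and fragment sizes are those reported in the text. A subset relation among deleted probes is consistent with nesting; the text's conclusion additionally relies on Y10 and Y14 flanking Y11.

```python
# Probe results per deleted chromosome, as reported in the text.
# True = probe hybridizes (sequence retained), False = probe is deleted.
probes_3214 = {"Y10": False, "Y11": False, "Y14": False}
probes_3824 = {"Y10": True, "Y11": False, "Y14": True}


def deleted(probes):
    """Return the set of probes missing from a chromosome."""
    return {name for name, present in probes.items() if not present}


def deletion_size_kb(normal_kb, aberrant_kb):
    """Estimate deletion size from normal and aberrant fragment sizes."""
    return normal_kb - aberrant_kb


# The 3824 deletion removes a strict subset of the probes removed in 3214.
nested = deleted(probes_3824) < deleted(probes_3214)

noti_estimate = deletion_size_kb(1200, 1100)  # NotI fragments, patient 3824
mlui_estimate = deletion_size_kb(860, 760)    # MluI fragments, patient 3824
print(nested, noti_estimate, mlui_estimate)
```

Both digests give the same size estimate, which agrees with the 100 kb figure stated for patient 3824.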

Probes Y10, Y11, Y14 and Map30 each hybridized to
YAC 310D8, indicating that this YAC spanned the patient
3824 deletion and, at a minimum, most of the 3214 deletion.
The YAC characterizations, therefore, confirmed the
presence of deletions in the patients and provided physical
representation of the deleted region.

EXAMPLE 6 

This example demonstrates that the MCC coding
sequence maps outside of the region deleted in the two FAP
patients characterized in Example 4.

An intriguing FAP candidate gene, MCC, recently was
ascertained with cosmid L5.71 and was shown to have
undergone mutation in colon carcinomas (Kinzler et al.,
supra). It was therefore of interest to map this gene with
respect to the deletions in APC patients. Hybridization of
MCC probes with an overlapping series of YAC clones
extending in either direction from L5.71 showed that the 3'
end of MCC must be oriented toward the region of the two
APC deletions.

Therefore, two 3' cDNA clones from MCC were mapped
with respect to the deletions: clone 1CX (bp 2378-4181) and
clone 7 (bp 2890-3560). Clone 1CX contains sequences from
the C-terminal end of the open reading frame, which stops
at nucleotide 2708, as well as 3' untranslated sequence.
Clone 7 contains sequence that is entirely 3' to the open
reading frame. Importantly, the entire 3' untranslated
sequence contained in the cDNA clones consists of a single
2.5 kb exon. These two clones were hybridized to DNAs
from the YACs spanning the FAP region. Clone 7 fails to
hybridize to YAC 310D8, although it does hybridize to
YACs 183H12 and 57B8; the same result was obtained with
the cDNA 1CX. Furthermore, these probes did show
hybridization to DNAs from both hybrid cell lines (HHW1159
and HHW1155) and the lymphoblastoid cell line from patient
3214, confirming their locations outside the deleted region.
Additional mapping experiments suggested that the 3' end of
the MCC cDNA clone contig is likely to be located more
than 45 kb from the deletion of patient 3214 and, therefore,
more than 100 kb from the deletion of patient 3824.

EXAMPLE 7 

This example identifies three genes within the deleted
region of chromosome 5 in the two unrelated FAP patients
characterized in Example 4.

Genomic clones were used to screen cDNA libraries in
three separate experiments. One screening was done with a
phage clone derived from YAC 310D8 known to span the
260 kb deletion of patient 3214. A large-insert phage library
was constructed from this YAC; screening with Y11
identified λ205, which mapped within both deletions. When
clone λ205 was used to probe a random- plus
oligo(dT)-primed fetal brain cDNA library (approximately
300,000 phage), six cDNA clones were isolated and each of
them mapped entirely within both deletions. Sequence
analysis of these six clones formed a single cDNA contig, but
did not reveal an extended open reading frame. One of the six
cDNAs was used to isolate more cDNA clones, some of
which crossed the L5.71-proximal breakpoint of the 3824
deletion, as indicated by hybridization to both chromosomes
of this patient. These clones also contained an open reading
frame, indicating a transcriptional orientation proximal to
distal with respect to L5.71. This gene was named DP1
(deleted in polyposis 1). This gene is identical to TB2
described above.
cDNA walks yielded a cDNA contig of 3.0-3.5 kb, and
included two clones containing terminal poly(A) sequences.
This size corresponds to the 3.5 kb band seen by Northern
analysis. Sequencing of the first 3163 bp of the cDNA contig
revealed an open reading frame extending from the first base
to nucleotide 631, followed by a 2.5 kb 3' untranslated
region. The sequence surrounding the methionine codon at
base 77 conforms to the Kozak consensus of an initiation
methionine (Kozak, 1984). Failed attempts to walk farther,
coupled with the similarity of the lengths of isolated cDNA
and mRNA, suggested that the NH2-terminus of the DP1
protein had been reached. Hybridization to a combination of
genomic and YAC DNAs cut with various enzymes
indicated the genomic coverage of DP1 to be approximately
30 kb.

Two additional probes for the locus, YS-11 and YS-39,
which had been ascertained by screening of a cDNA library
with an independent YAC probe identified with MCC
sequences adjacent to L5.71, were mapped into the deletion
region. YS-39 was shown to be a cDNA identical in
sequence to DP1. Partial characterization of YS-11 had
shown that 200 bp of DNA sequence at one end was
identical to sequence coding for the 19 kd protein of the
ribosomal signal recognition particle, SRP19 (Lingelbach et
al., supra). Hybridization experiments mapped YS-11 within
both deletions. The sequence of this clone, however, was
found to be complex. Although 454 bp of the 1032 bp
sequence of YS-11 were identical to the GenBank entry for
the SRP19 gene, another 578 bp appended 5' to the SRP19
sequence was found to consist of previously unreported
sequence containing no extended open reading frames. This
suggested that YS-11 was either a chimeric clone containing
two independent inserts or a clone of an incompletely
processed or aberrant message. If YS-11 were a
conventional chimeric clone, the independent segments would
not be expected to map to the same physical region. The
segments resulting from anomalous processing of a
continuous transcript, however, would map to a single
chromosomal region.

Inverse PCR with primers specific to the two ends of
YS-11, the SRP19 end and the unidentified region, verified
that both sequences map within the YAC 310D8; therefore,
YS-11 is most likely a clone of an immature or anomalous
mRNA species. Subsequently, both ends were shown to lie
within the deleted region of patient 3824, and YS-11 was used
to screen for additional cDNA clones.

Of the 14 cDNA clones selected from the fetal brain
library, one clone, V5, was of particular interest in that it
contained an open reading frame throughout, although it
included only a short identity to the first 78 5' bases of the
YS-11 sequence. Following the 78 bp of identical sequence,
the two cDNA sequences diverged at an AG. Furthermore,
divergence from genomic sequence was also seen after these
78 bp, suggesting the presence of a splice junction, and
supporting the view that YS-11 represents an irregular
message.

Starting with V5, successive 5' and 3' walks were
performed; the resulting cDNA contig consisted of more than
100 clones, which defined a new transcript, DP2. Clones
walking in the 5' direction crossed the 3824 deletion
breakpoint farthest from L5.71; since its 3' end is closer to
this cosmid than its 5' end, the transcriptional orientation of
DP2 is opposite to that of MCC and DP1.

The third screening approach relied on hybridization with
a 120 kb MluI fragment from YAC 57B8. This fragment
hybridizes with probe Y11 and completely spans the 100 kb
deletion in patient 3824. The fragment was purified on two
preparative PFGs, labeled, and used to screen a fetal brain
cDNA library. A number of cDNA clones previously
identified in the development of the DP1 and DP2 contigs
were reascertained. However, 19 new cDNA clones mapped
into the patient 3824 deletion. Analysis indicated that these
19 formed a new contig, DP3, containing a large open reading
frame.

A clone from the 5' end of this new cDNA contig
hybridized to the same EcoRI fragment as the 3' end of DP2.
Subsequently, the DP2 and DP3 contigs were connected by
a single 5' walking step from DP3, to form the single contig
DP2.5. The complete nucleotide sequence of DP2.5 is
shown in FIG. 9.

The consensus cDNA sequence of DP2.5 suggests that the
entire coding sequence of DP2.5 has been obtained and is
8532 bp long. The most 5' ATG codon occurs two codons
from an in-frame stop and conforms to the Kozak initiation
consensus (Kozak, Nucl. Acids Res., Vol. 12, pp. 857-872
(1984)). The 3' open reading frame breaks down over the final
1.8 kb, giving multiple stops in all frames. A poly(A)
sequence was found in one clone approximately 1 kb into the
3' untranslated region, associated with a polyadenylation
signal 33 bp upstream (position 9530). The open reading
frame is almost identical to that identified as APC above.

An alternatively spliced exon at nucleotide 934 of the
DP2.5 transcript is of potential interest. It was first
discovered by noting that two classes of cDNA had been
isolated. The more abundant cDNA class contains a 303 bp
exon not included in the other. The presence in vivo of the two
transcripts was verified by an exon connection experiment.
Primers flanking the alternatively spliced exon were used to
amplify, by PCR, cDNA prepared from various adult tissues.
Two PCR products that differed in size by approximately
300 bases were amplified from all the tissues tested; the
larger product was always more abundant than the smaller.

EXAMPLE 8

This example demonstrates the primers used to identify
subtle mutations in DP1, SRP19, and DP2.5.

To obtain DNA sequence adjacent to the exons of the
genes DP1, DP2.5, and SRP19, sequencing substrate was
obtained by inverse PCR amplification of DNAs from two
YACs, 310D8 and 183H12, that span the deletions. Ligation
at low concentration cyclized the restriction
enzyme-digested YAC DNAs. Oligonucleotides with
sequencing tails, designed in inverse orientation at intervals
along the cDNAs, primed PCR amplification from the cyclized
templates. Comparison of these DNA sequences with the cDNA
sequences placed exon boundaries at the divergence points.
SRP19 and DP1 were each shown to have five exons. DP2.5
consisted of 15 exons. The sequences of the oligonucleotides
synthesized to provide PCR amplification primers for the
exons of each of these genes are listed in Table III (SEQ ID
NO:39-94).

TABLE III

Sequences of Primers Used for SSCP Analyses 
Exon Primer 1 Primer 2 

DP1

UP-TCCCCGCCTGCCGCTCTC RP-GCAGCGGCGGCTCCCGTG 
UP-GTGAACGGCKTTCA3GCTGC RP-ACGTGCGGGGAGGAAK3GA 
UP-AIGAtAimTACC^AAlGATAJAC RP-TDgTCCIA Cl ' lC ' 1 ' lClX rACAG 
UP-TACCCAIGCTGGCIUl'ril'lC RP-TOGGGCX^TCTrGFriCCrGA 
UP-ACAnAGGCACAAAGCTTGCAA RP-ATCAAGCTCCAGTAAGAAGGTA 

SRP19 

UP -TGCG GCKXrTGGCjl'lUl'lG RP-GCCCCirCCTTTCTGAGGAC 
UP-TTTIXnCCTGCCTCrTACTGC RP-ATGACAOCCOCC/aTCCCTC 
UP^CACTIAAAGCACAIAIArrrAGT RP-<nAIGGAAAATAGTGAAGAACC 
UP-rnrrTAAGTCCrCjl'Il'l'lUI'I'I'lG RP-TITAGAACClTlTlIGflGTIGTG 
UP^riCAGATMACACrAAGCCIAAC RP<^lGTCTCTIACAGTAGriACCA 

DP2.5



UP-AGGTCCAAGGGIAGCCAAGG* RP-TAAAAATGGATAAACTACAAnAAAAG 

tlP-AAATACAGAAICAIGTCTTGAAGT RP-ACACCX\AAGAIXjACAAITIGAG 

UP-TAACTTAG ATAGC^GTAATITCCC * RP-ACAATAAACIGGAGTACACAAGG 

UP-AIAGGTCATTGCl'lUl'lGCrGAr* RP-TGAATITIAAIGGXrTACCTAGGT 

UP-CnTTnTGCTTTIACTGArTAACG RP-TCTAATICAriTIATrXXnAAIACXnC 

UP^KHAGCCAIAGTAIGAITATTTCr RP^HACCBOTITIAX^CCCACAAAC

TABLE III-continued

Sequences of Primers Used for SSCP Analyses
Exon Primer 1 Primer 2



UP-AAGAAAGCCTACACCAITnTGC 

UP-ACCT^GTCTAAAITAIACCArC 

UP-AGTCCnAAITITGTITCIAAACrC 

UP-TCATTCACTCACAGCCTGATGAC* 

XJP-AAACATCAI TGCrCr rC AAArA AC 

XJPUjATCAnGICll'l'l'lCClUl'llGC 

up-iiiiAAAiXjAixxriciArrcTGr^ 

UP-TITCTXncnACrGCnAGCAIT 
UP-TAGATGACCCATATIUICriTC 

3-A UP-GTTACIXjCArACACATKjTGAC 

-B UP-AGTACAAGGATGCCAAMTIMXj* 

-C UP- ATITG AATACTACAGTGTTACCC * 

-D UP^TOCCCAIACACAITCAAACAC* 

-E UP-AGTCTIAAATA1TCAGATGAGCAG* 

-F UP-AAGOCTACCAATIA1AGTGAACG* 

-G UP-AAGAAACAATACAGACTIATTGTG* 

-H UPAICICOCTCCAAAAGTGGTGC* 

-I UP-AGIAAATGCIGCAGTrCAGAGG* 

-J UP-CCCAGACTGCTTCAAAATTACC* 

-K UP4XCTTCAAATGAGTIAGCIGC* 

-L UP-ACCCAACAAAAATCAGTIAGATG* 

-N UP-AIGATGTlGACCnTCCAGGG* 

-M UP-AAAGACA3ACCAGACAGAGGG* 

-O UP-AAGATGACCIXTITGCAGGAATG* 

-P UP-CAATAGTAAGTAGTTIACATCAAG* 

-Q UP-CAGCCCCTTCAAGCAAACArC* 

-R UP-CAGTCTCCTGGCCGAAACTC* 

-S UP-TGG1AATGGAGCCAAIAAAAAGG* 

-T UP-TGTCTICIAICC ACAC AITCGrrC * 

-U UP-GGAGAAGAACTGGAAGTICATC* 

-V UP-TXTICCCACAGG1AAIACTCCC 

-W UP-CAGGACAAAATAATXXTGTCCC 



All primers are read in the 5' to 3' direction. The first primer in each pair lies 5' of the exon it amplifies; the
second primer lies 3' of the exon it amplifies. Primers that lie within the exon are identified by an asterisk.
UP represents the -21M13 universal primer sequence;
RP represents the M13 reverse primer sequence.

With the exception of exons 1, 3, 4, 9, and 15 of DP2.5 (see
below), the primer sequences were located in intron
sequences flanking the exons. The 5' primer of exon 1 is
complementary to the cDNA sequence, but extends just into
the 5' Kozak consensus sequence for the initiator
methionine, allowing a survey of the translated sequences.
The 5' primer of exon 3 is actually in the 5' coding sequences
of this exon, as three separate intronic primers simply would
not amplify. The 5' primer of exon 4 just overlaps the 5' end
of this exon, and we thus fail to survey the 19 most 5' bases
of this exon. For exon 9, two overlapping primer sets were
used, such that each had one end within the exon. For exon
15, the large 3' exon of DP2.5, overlapping primer pairs
were placed along the length of the exon; each pair amplified
a product of 250-400 bases.

EXAMPLE 9 

This example demonstrates the use of single stranded
conformation polymorphism (SSCP) analysis as described
by Orita et al., Proc. Natl. Acad. Sci. U.S.A., Vol. 86, pp.
2766-70 (1989) and Genomics, Vol. 5, pp. 874-879 (1989)
as applied to DP1, SRP19 and DP2.5.

SSCP analysis identifies most single- or multiple-base
changes in DNA fragments up to 400 bases in length.
Sequence alterations are detected as shifts in electrophoretic
mobility of single-stranded DNA on nondenaturing
acrylamide gels; the two complementary strands of a DNA
segment usually resolve as two SSCP conformers of distinct
mobilities. However, if the sample is from an individual
heterozygous for a base-pair variant within the amplified



RP^JATCAnumGAACCAICrTGC 

RP43TCATGGCATTACTGACCAG 

RP-TGAAGGACTCCGATTTCACCC* 

RP<3CTITGAAACATGCACTACGAT 

RP-TACCATGATITAAAAATCCACCAG 

RP-CTGAGCTAKHTAAGAAATACATG 

RP-ACAGAGTCAGACCCTCCCTCAAAG 

RP-ATACACAGGTAAGAAATTAGGA 

RP^AATTAGGTCTrnTGAGAGTA 

KP<jC11111GTTTCXjIAACATGAAG* 

RP-ACTTCIATCTrrrrCAGAACGAG* 

RP-CTKHAITCTAAITIXjGCATAAGG 

RP-TGTITGCGTCTTGCCCATCTT* 

RP<nTIUIX7ITCATTATACT 

RP-AGCTGATGACAAAGATGATAATC* 

RP-ATGAGTGGGGTCTCCTGAAC* 

RP-TCCATCTGGAGTACTTIUlXnG* 

RP-CCGTGGCATAIX^TCCCCC* 

RP^AGCCTCATCTGTACTIGTGC* 

RP-TTCTGGTAIAGGTITIACTGGTG* 

RP-GTGGCTGGTAACTITAGCCTC* 

RP-ATIGIGTAACTITrCAIC AGTTGC * 

RP<TITITITGGCATIXKX5GAGCT* 

RP<jAATCAGACCAAGCTTGTCTAGAT* 

RP-AAACAGGACTTGIACTGTAGGA* 

RP-GAGGACTIAITCCATrrCIACC* 

RP-GTTGACTGGCGTIACTAATACAG* 

RP-TCGGACTITrCGCCATCCAC* 

RP-AICTITITCATC CTCACT ITITGC* 

RP-TIGAATCTITAATCrITrGGATl^ 

RP-GCT\CAACTGAATCGGGTACG 

RP-AlTllUl'iACTITCATrCTrcCTC 



segment, often three or more bands are seen. In some cases,
even the sample from a homozygous individual will show
multiple bands. Base-pair-change variants are identified by
differences in pattern among the DNAs of the sample set.

Exons of the candidate genes were amplified by PCR
from the DNAs of 61 unrelated FAP patients and a control
set of 12 normal individuals. The five exons from DP1
revealed no unique conformers in the FAP patients, although
common conformers were observed with exons 2 and 3 in
some individuals of both affected and control sets, indicating
the presence of DNA sequence polymorphisms. Likewise,
none of the five exons of SRP19 revealed unique conformers
in DNA from FAP patients in the test panel.

Testing of exons 1 through 14 and primer sets A through
N of exon 15, of the DP2.5 gene, however, revealed variant
conformers specific to FAP patients in exons 7, 8, 10, 11, and
15. These variants were in the unrelated patients 3746, 3460,
3827, 3712, and 3751, respectively. The PCR-SSCP
procedure was repeated for each of these exons in the five
affected individuals and in an expanded set of 48 normal
controls. The variant bands were reproducible in the FAP
patients but were not observed in any of the control DNA
samples. Additional variant conformers in exons 11 and 15 of
the DP2.5 gene were seen; however, each of these was found in
both the affected and control DNA sets. The five sets of
conformers unique to the FAP patients were sequenced to
determine the nucleotide changes responsible for their
altered mobilities. The normal conformers from the host
individuals were sequenced also. Bands were cut from the
dried acrylamide gels, and the DNA was eluted. PCR
amplification of these DNAs provided template for
sequencing.

The sequences of the unique conformers from exons 7, 8,
10, and 11 of DP2.5 revealed dramatic mutations in the
DP2.5 gene. The sequence of the new mutation creating the
exon 7 conformer in patient 3746 was shown to contain a
deletion of two adjacent nucleotides, at positions 730 and
731 in the cDNA sequence (FIG. 7, SEQ ID NO:1). The
normal sequence at this splice junction is CAGG GTCA
(intronic sequence underlined), with the intron-exon
boundary between the two repetitions of AG. The mutant
allele in this patient has the sequence CAGGTCA. Although
this change is at the 5' splice site, comparison with known
consensus sequences of splice junctions would suggest that
a functional splice junction is maintained. If this new splice
junction were functional, the mutation would introduce a
frameshift that creates a stop codon 15 nucleotides
downstream. If the new splice junction were not functional,
messenger processing would be significantly altered.

To confirm the 2-base deletion, the PCR product from
FAP patient 3746 and a control DNA were electrophoresed
on an acrylamide-urea denaturing gel, along with the
products of a sequencing reaction. The sample from patient
3746 showed two bands differing in size by 2 nucleotides,
with the larger band identical in mobility to the control
sample; this result was independent confirmation that
patient 3746 is heterozygous for a 2 bp deletion.

The unique conformer found in exon 8 of patient 3460 
was found to carry a C-T transition, at position 904 in the 
cDNA sequence of DP2.5 (shown in FIG. 7), which replaced 
the normal sequence of CGA with TGA. This point 
mutation, when read in frame, results in a stop codon 
replacing the normal arginine codon. This single-base 
change had occurred within the context of a CG dimer, a 
potential hot spot for mutation (Barker et al., 1984). 
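
The effect of this C-to-T change can be illustrated with a short sketch. The code below is illustrative only and is not part of the original analysis; the helper name `substitute` and the two-entry lookup table are assumptions introduced here, with the two codon assignments taken from the standard genetic code.

```python
# Only the two codons discussed in the text (standard genetic code).
CODON_TABLE = {"CGA": "Arg", "TGA": "Stop"}


def substitute(codon, index, base):
    """Return the codon with the base at `index` replaced by `base`."""
    return codon[:index] + base + codon[index + 1:]


normal = "CGA"
mutant = substitute(normal, 0, "T")   # C-to-T transition at the first base

# The changed C is followed by G, i.e. it lies in a CG dinucleotide.
in_cg_context = normal.startswith("CG")

print(normal, CODON_TABLE[normal], "->", mutant, CODON_TABLE[mutant])
```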

The conformer unique to FAP patient 3827 in exon 10 was
found to contain a deletion of one nucleotide (1367, 1368, or
1369) when compared to the normal sequence found in the
other bands on the SSCP gel. This deletion, occurring within
a set of three T's, changed the sequence from CTTTCA to
CTTCA; this 1 base frameshift creates a downstream stop
within 30 bases. The PCR product amplified from this
patient's DNA also was electrophoresed on an
acrylamide-urea denaturing gel, along with the PCR product
from a control DNA and products from a sequencing reaction.
The patient's PCR product showed two bands differing by 1 bp
in length, with the larger identical in mobility to the PCR
product from the normal DNA; this result confirmed the
presence of a 1 bp deletion in patient 3827.
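
The reason the deleted nucleotide is given as any of three positions can be shown with a brief sketch. This is an illustration only, assuming the normal sequence reads CTTTCA (a run of three T's, as described above); the helper name `delete_base` is introduced here and is not from the original work.

```python
def delete_base(seq, index):
    """Return `seq` with the single base at `index` removed."""
    return seq[:index] + seq[index + 1:]


normal = "CTTTCA"   # assumed reading of the normal sequence

# Remove each of the three consecutive T's in turn (indices 1, 2 and 3).
products = {delete_base(normal, i) for i in (1, 2, 3)}
print(products)
```

All three deletions yield the same mutant sequence, so sequencing alone cannot distinguish which T of the run was lost.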

Sequence analysis of the variant conformer of exon 11
from patient 3712 revealed the substitution of a T by a G at
position changing the normal tyrosine codon to a stop codon.

The pair of conformers observed in exon 15 of the DP2.5
gene for FAP patient 3751 also was sequenced. These
conformers were found to carry a nucleotide substitution of
C to G at position 5253, the third base of a valine codon. No
amino acid change resulted from this substitution,
suggesting that this conformer reflects a genetically silent
polymorphism.
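
That a third-base change in a valine codon is silent follows from the standard genetic code, in which all four GTN codons specify valine. The sketch below is illustrative; the function name `is_silent_valine_change` is an assumption introduced here, and the codon GTC is inferred from the statement that C is the third base of a valine codon.

```python
# All four valine codons in the standard genetic code.
VALINE_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def is_silent_valine_change(codon, new_third_base):
    """True if replacing the third base keeps the codon a valine codon."""
    mutant = codon[:2] + new_third_base
    return codon in VALINE_CODONS and mutant in VALINE_CODONS


# C to G at the third base of a valine codon: GTC -> GTG.
print(is_silent_valine_change("GTC", "G"))
```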

The observation of distinct inactivating mutations in the
DP2.5 gene in four unrelated patients strongly suggested
that DP2.5 is the gene involved in FAP. These mutations are
summarized in Table IIA.

EXAMPLE 10

This example demonstrates that the mutations identified
in the DP2.5 (APC) gene segregate with the FAP phenotype.

Patient 3746, described above as carrying an APC allele
with a frameshift mutation, is an affected offspring of two
normal parents. Colonoscopy revealed no polyps in either
parent nor among the patient's three siblings.

DNA samples from both parents, from the patient's wife,
and from their three children were examined. SSCP analysis
of DNA from both of the patient's parents displayed the
normal pattern of conformers for exon 7, as did DNA from
the patient's wife and one of his offspring. The two other
children, however, displayed the same new conformers as
their affected father. Testing of the patient and his parents
with highly polymorphic VNTR (variable number of tandem
repeat) markers showed a 99.98% likelihood that they are
his biological parents.

These observations confirmed that this novel conformer,
known to reflect a 2 bp deletion mutation in the DP2.5 gene,
appeared spontaneously with FAP in this pedigree and was
transmitted to two of the children of the affected individual.

EXAMPLE 11 

This example demonstrates polymorphisms in the APC
gene which appear to be unrelated to disease (FAP).

Sequencing of variant conformers found among controls
as well as individuals with APC has revealed the following
polymorphisms in the APC gene: first, in exon 11, at position
1458, a substitution of T to C creating an RsaI restriction site
but no amino acid change; and second, in exon 15, at
positions 5037 and 5271, substitutions of A to G and G to T,
respectively, neither resulting in amino acid substitutions.
These nucleotide polymorphisms in the APC gene sequence
may be useful for diagnostic purposes.

EXAMPLE 12

This example shows the structure of the APC gene.

The structure of the APC gene is schematically shown in
FIG. 8, with flanking intron sequences indicated (SEQ ID
NO:11-38).

The continuity of the very large (6.5 kb), most 3' exon in
DP2.5 was shown in two ways. First, inverse PCR with
primers spanning the entire length of this exon revealed no
divergence of the cDNA sequence from the genomic
sequence. Second, PCR amplification with converging
primers placed at intervals along the exon generated products
of the same size whether amplified from the originally
isolated cDNA, cDNA from various tissues, or genomic
template. Two forms of exon 9 were found in DP2.5: one is the
complete exon; and the other, labeled exon 9A, is the result
of a splice into the interior of the exon that deletes bases 934
to 1236 in the mRNA and removes 101 amino acids from the
predicted protein (see FIG. 3, SEQ ID NO:1 and 2).
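
The stated figures for exon 9A are mutually consistent, as the short calculation below illustrates. The variable names are introduced here for illustration; the base positions are those given in the text.

```python
first_base, last_base = 934, 1236           # bases removed in the exon 9A form
deleted_bp = last_base - first_base + 1     # inclusive count of deleted bases
codons, remainder = divmod(deleted_bp, 3)   # whole codons and leftover bases

print(deleted_bp, codons, remainder)
```

A remainder of zero means the deletion removes whole codons, so the reading frame downstream of the splice is preserved.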

EXAMPLE 13 

This example demonstrates the mapping of the FAP
deletions with respect to the APC exons.

Somatic cell hybrids carrying the segregated
chromosomes 5 from the 100 kb (HHW1291) and 260 kb
(HHW1155) deletion patients were used to determine the
distribution of the APC gene's exons across the deletions.
DNAs from these cell lines were used as template, along
with genomic DNA from a normal control, for PCR-based
amplification of the APC exons.

PCR analysis of the hybrids from the 260 kb deletion of
patient 3214 showed that all but one (exon 1) of the APC
exons are removed by this deletion. PCR analysis of the
somatic cell hybrid HHW1291, carrying the chromosome 5
homolog with the 100 kb deletion from patient 3824,
revealed that exons 1 through 9 are present but exons 10
through 15 are missing. This result placed the deletion
breakpoint either between exons 9 and 10 or within exon 10.
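
The inference from exon presence or absence to a breakpoint interval can be sketched as follows. This is an illustrative sketch only; the function name `breakpoint_interval` and the dictionary representation are assumptions introduced here, and the exon results are those reported above.

```python
def breakpoint_interval(present):
    """Given {exon number: amplified?}, return the adjacent exon pair
    (last present, first absent) that brackets the deletion breakpoint,
    or None if no present-to-absent transition is found."""
    exons = sorted(present)
    for current, following in zip(exons, exons[1:]):
        if present[current] and not present[following]:
            return current, following
    return None


# Hybrid HHW1291 (patient 3824): exons 1-9 present, 10-15 missing.
hhw1291 = {exon: exon <= 9 for exon in range(1, 16)}
# Hybrid from patient 3214: only exon 1 remains.
hhw1155 = {exon: exon == 1 for exon in range(1, 16)}

print(breakpoint_interval(hhw1291), breakpoint_interval(hhw1155))
```

Because PCR reports only whether each exon amplifies, the breakpoint may lie in the intron between the bracketing exons or within the first missing exon itself.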

EXAMPLE 14 

This example demonstrates the expression of alternately
spliced APC messenger in normal tissues and in cancer cell
lines.

Tissues that express the APC gene were identified by PCR
amplification of cDNA made to mRNA with primers located
within adjacent APC exons. In addition, PCR primers that
flank the alternatively spliced exon 9 were chosen so that the
expression pattern of both splice forms could be assessed.
All tissue types tested (brain, lung, aorta, spleen, heart,
kidney, liver, stomach, placenta, and colonic mucosa) and
cultured cell lines (lymphoblasts, HL60, and
choriocarcinoma) expressed both splice forms of the APC
gene. We note, however, that expression by lymphocytes
normally residing in some tissues, including colon, prevents
unequivocal assessment of expression. The large mRNA,
containing the complete exon 9 rather than only exon 9A,
appears to be the more abundant message.

Northern analysis of poly(A)-selected RNA from lym- 
phoblasts revealed a single band of approximately 10 kb, 
consistent with the size of the sequenced cDNA. 

EXAMPLE 15 

This example discusses structural features of the APC 
protein predicted from the sequence. 

The cDNA consensus sequence of APC predicts that the
longer, more abundant form of the message codes for a 2842
or 2844 amino acid peptide with a mass of 311.8 kd. This
predicted APC peptide was compared with the current data
bases of protein and DNA sequences using both
Intelligenetics and GCG software packages. No genes with a
high degree of amino acid sequence similarity were found.
Although many short (approximately 20 amino acid) regions
of sequence similarity were uncovered, none was
sufficiently strong to reveal which, if any, might represent
functional homology. Interestingly, multiple similarities to
myosins and keratins did appear. The APC gene also was
scanned for sequence motifs of known function; although
multiple glycosylation, phosphorylation, and myristoylation
sites were seen, their significance is uncertain.

Analysis of the APC peptide sequence did identify
features important in considering potential protein structure.
Hydropathy plots (Kyte and Doolittle, J. Mol. Biol. Vol. 157,
pp. 105-132 (1982)) indicate that the APC protein is notably
hydrophilic. No hydrophobic domains suggesting a signal
peptide or a membrane-spanning domain were found.
Analysis of the first 1000 residues indicates that α-helical
rods may form (Cohen and Parry, Trends Biochem. Sci. Vol.
11, pp. 245-248 (1986)); there is a scarcity of proline
residues and there are a number of regions containing
heptad repeats (apolar-X-X-apolar-X-X-X). Interestingly, in
exon 9A, the deleted form of exon 9, two heptad repeat
regions are reconnected in the proper heptad repeat frame,
deleting the intervening peptide region. After the first 1000
residues, the high proline content of the remainder of the
peptide suggests a compact rather than a rod-like structure.
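
The heptad pattern (apolar-X-X-apolar-X-X-X) described above can be searched for with a simple scan. The sketch below is illustrative only and is not the method used in the original analysis: the set of residues treated as apolar, the function name `heptad_run`, and the toy sequence are all assumptions introduced here, and the toy sequence is not taken from APC.

```python
# Residues treated as apolar here (an assumption; definitions vary).
APOLAR = set("AILMFVW")


def heptad_run(seq, start):
    """Count consecutive 7-residue windows, beginning at `start`, that
    have apolar residues at the first and fourth positions."""
    count = 0
    i = start
    while i + 7 <= len(seq) and seq[i] in APOLAR and seq[i + 3] in APOLAR:
        count += 1
        i += 7
    return count


# Hypothetical toy sequence: three heptads, then a proline-rich stretch.
toy = "LKELEEK" + "IRQLEDK" + "VSELKNR" + "PGSPTSP"
print(heptad_run(toy, 0))
```

In this toy case the run ends where the proline-rich segment begins, which mirrors the contrast the text draws between the proline-poor first 1000 residues and the proline-rich remainder.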



The most prominent feature of the second 1000 residues
is a 20 amino acid repeat that is iterated seven times with
semiregular spacing (Table IV). The intervening sequences
between the seven repeat regions contained 114, 116, 151,
205, 107, and 58 amino acids, respectively. Finally, residues
2200-2400 contain a 200 amino acid basic domain.
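
The spacing of the repeats can be checked against the start positions listed in Table IV. The calculation below is an illustrative sketch; it computes start-to-start distances, which is an assumption about how the spacings were counted.

```python
# First residue of each 20-amino-acid repeat, as listed in Table IV.
starts = [1262, 1376, 1492, 1643, 1848, 1953, 2013]

# Distance from the start of each repeat to the start of the next.
spacing = [later - earlier for earlier, later in zip(starts, starts[1:])]
print(spacing)
```

Under this counting convention the first four distances agree with the values quoted above, while the last two come out close to, but not identical with, the quoted 107 and 58; the difference may reflect a different counting convention or numbering.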

TABLE IV

Seven Different Versions of the 20-Amino Acid Repeat

Consensus: F*VE*TP*CFSR*SSLSSLS
     1262: YCVEDTPICFSRCSSLSSLS
     1376: HTVQETPLMFSRCTSVSSLD
     1492: FATESTPDGFSCSSSLSALS
     1643: YCVEGTPINFSTATSLSDLT
     1848: TPIEGTPYCFSRNDSLSSLD
     1953: FAIENTPVCPSHNSSLSSLS
     2013: RHVEDTPVCFSRNSSLSSLS

Numbers denote the first amino acid of each repeat. The consensus sequence
at the top reflects a majority amino acid at a given position.

[57] ABSTRACT

A human gene termed APC is disclosed. Methods and kits 
are provided for assessing mutations of the APC gene in 
human tissues and body samples. APC mutations are found 
in familial adenomatous polyposis patients as well as in 
sporadic colorectal cancer patients. APC is expressed in 
most normal tissues. These results suggest that APC is a 
tumor suppressor.

JOINT DECLARATION FOR REISSUE PATENT APPLICATION

As the below named inventors, we hereby declare that:

Our residence, post office address and citizenship are as stated below next to our names; 

We believe we are the original, first and joint inventors of the subject matter which is claimed and for which a patent is sought on the
invention entitled APC ANTIBODIES, the specification of which

□ is attached hereto.

■ was filed on May 25, 1995 as Application Serial Number 08/452,654 and was amended on          (if applicable).

We hereby state that we have reviewed and understand the contents of the above identified specification, including the claims, as 
amended by any amendment referred to above. 

We acknowledge the duty to disclose information which is material to patentability in accordance with Title 37, Code of Federal 
Regulations, § 1.56(a). 

Prior Foreign Application(s)

We hereby claim foreign priority benefits under Title 35, United States Code, §119 of any foreign application(s) for patent or inventor's
certificate listed below and have also identified below any foreign application(s) for patent or inventor's certificate having a filing date before
that of the application on which priority is claimed:

Country          Application Number   Date of Filing       Date of Issue        Priority Claimed
                                      (day, month, year)   (day, month, year)   Under 35 U.S.C. §119

United Kingdom   9100962.1            16/01/91                                  YES
United Kingdom   9100963.9            16/01/91                                  YES
United Kingdom   9100974.6            16/01/91                                  YES
United Kingdom   9100975.3            16/01/91                                  YES

Prior United States Application(s)

We hereby claim the benefit under Title 35, United States Code, §120 of any United States application(s) listed below and, insofar as the
subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by the first
paragraph of Title 35, United States Code, §112, we acknowledge the duty to disclose material information as defined in Title 37, Code of Federal
Regulations, §1.56(a) which occurred between the filing date of the prior application and the national or PCT international filing date of this
application:

Date of Filing       Status (Patented, Pending, Abandoned)















Prior United States Provisional Application(s)

We hereby claim priority benefits under Title 35, United States Code, §119(e) of any provisional application for patent listed below and
have also identified below any provisional application for patent having a filing date before that of the application on which priority is claimed:

Provisional Application Number   Date of Filing       Priority Claimed
                                 (day, month, year)   Under 35 U.S.C. §119(e)






















5,691,454, is wholly or partially inoperative or invalid because of the following defects in the 
specification: 

• the amino acid sequence provided for the APC protein in SEQ ID NO:7 of 
the sequence listing contains a minor error; and 

• the specification refers to overlapping APC cDNA clones as "defining an 
ORF of 2842 amino acids" (column 4, line 31) and as coding "for a 2842 
or 2844 amino acid peptide" (column 31, lines 32-33), rather than the 
correct number of 2843 amino acids. 

(2) The correction of SEQ ID NO:7 is supported by the specification. The missing 
proline at position 173 in SEQ ID NO:7 is supported in the specification by the proline which 
is present at position 173 in SEQ ID NOS:1 and 2 and in Figure 3. In addition, routine analysis 
of YAC 37HG4 deposited as NCIMB 40353, referred to at column 12, lines 35-39 of U.S. Patent 
5,691,454, establishes that there is, indeed, a proline at codon 173. The deposit was made under 
the terms of the Budapest Treaty. (See declaration of Dr. Sarah Kagan, of record in Serial No. 
08/452,654, filed February 14, 1996.) One of ordinary skill in the art would have recognized the 
omission of the proline in SEQ ID NO:7 as a minor error by noting the inconsistency between 
the amino acid sequences presented in Figure 3 and in SEQ ID NOS:1 and 2 with that in SEQ 
ID NO:7. 
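
The location of codon 173 can be cross-checked against the sequence listing. The following Python sketch is illustrative only. It assumes that the coding sequence of SEQ ID NO:1 begins at nucleotide 34 (the CDS start given in the sequence listing) and uses the listing row that covers nucleotides 535 to 582; the function name `codon_coordinates` is introduced here for illustration and does not appear in the filing.

```python
CDS_START = 34  # first nucleotide of the ATG start codon in SEQ ID NO:1 (per the sequence listing)

def codon_coordinates(residue: int, cds_start: int = CDS_START) -> tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions of the codon for a residue."""
    first = cds_start + 3 * (residue - 1)
    return first, first + 2

# Row of SEQ ID NO:1 covering nucleotides 535..582 (residues 168..183), whitespace removed.
ROW_START = 535
ROW = "AGAATAGATAGTCTTCCTTTAACTGAAAATTTTTCCTTACAAACAGAT"

first, last = codon_coordinates(173)
codon = ROW[first - ROW_START : last - ROW_START + 1]
print(first, last, codon)  # 550 552 CCT (CCT encodes proline)
```

Under these assumptions the codon for residue 173 is CCT, which encodes proline, in agreement with the statement above.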

(3) The error at column 4, line 31, referring to "an ORF of 2842 amino acids," 
occurred because of the inadvertent omission of the proline at position 173 in originally filed 
Figure 3. The omission of this proline resulted in the APC protein being described in the 
specification as having 2842 rather than 2843 amino acids. 

(4) The error at column 31, lines 32-33, referring to a "2842 or 2844 amino acid 
peptide," occurred as follows. The application which issued as U.S. Patent 5,691,454 originally 
contained eight figures. In Figure 7 as originally filed, three supernumerary nucleotides were 
added at nucleotide positions 3972 (C), 3981 (G), and 3996 (A). As a result, the predicted amino 
acid sequence was erroneously stated to be "Ser Ser Val His Ser Thr Leu Glu" rather than "Ala 
Val Ser Gln His Pro Arg" at positions 1325 to 1331. This error resulted in an apparent sequence 
for the APC protein of 2844 amino acids. In combination with the omission of the proline at 
position 173 in originally filed Figure 3, this error resulted in the APC protein being described 
in the specification as a "2842 or 2844 amino acid peptide." Originally filed Figure 7 was 
canceled during prosecution of Serial No. 08/452,654, which issued as U.S. Patent 5,691,454. 
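
The kind of effect described in paragraph (4) can be illustrated with a short Python sketch. It translates the seven codons that SEQ ID NO:1 gives for residues 1325 to 1331, then translates the same stretch after three single-nucleotide insertions. The insertion offsets and inserted bases used here are hypothetical and chosen only for illustration; they are not a reconstruction of originally filed Figure 7. The helper names `translate` and `insert_bases` are likewise assumptions, and amino acids are printed in one-letter code for brevity.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate the complete codons of dna into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def insert_bases(dna: str, insertions: dict[int, str]) -> str:
    """Insert each base immediately before the given 0-based offset of dna."""
    out = []
    for i, base in enumerate(dna):
        out.append(insertions.get(i, ""))
        out.append(base)
    return "".join(out)

# Codons for residues 1325-1331 as printed in SEQ ID NO:1.
correct = "GCAGTGTCACAGCACCCTAGA"
# Hypothetical insertions (illustrative offsets and bases only).
shifted = insert_bases(correct, {0: "C", 7: "G", 16: "A"})

print(translate(correct))  # AVSQHPR  (Ala Val Ser Gln His Pro Arg)
print(translate(shifted))  # RSVHSTTR (eight residues instead of seven)
```

Three added nucleotides amount to one extra codon, so the translated stretch grows by one residue and the residues between the first and last insertion change, while the reading frame downstream of the third insertion is restored.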

(5) Correction of the number of amino acids in the APC protein does not add new 
matter to the specification. It merely renders consistent the number of amino acids shown in 
SEQ ID NOS:1 and 2 and the number of amino acids referred to in the specification. 
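
As a cross-check on the corrected count, the length of the open reading frame follows from the CDS coordinates. The Python sketch below assumes that the coding sequence of SEQ ID NO:1 spans nucleotides 34 to 8562, as stated in the sequence listing; the function name `codon_count` is illustrative.

```python
def codon_count(cds_start: int, cds_end: int) -> int:
    """Number of complete codons in an inclusive, 1-based CDS span."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3

print(codon_count(34, 8562))  # 2843
```

The span contains 8529 nucleotides, which divide into exactly 2843 codons, matching the corrected number of amino acids.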

(6) All errors which are being corrected in the present reissue application up to the 
time of filing of this declaration arose without any deceptive intent on the part of the applicants. 

(7) We hereby declare that all statements made herein of our own knowledge are true 
and that all statements made on information and belief are believed to be true; and further that 
these statements were made with the knowledge that willful false statements and the like so made 
are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United 
States Code and that such willful false statements may jeopardize the validity of the application 
And we hereby appoint, both jointly and severally, as our attorneys with full power of substitution and 
revocation, to prosecute this application and to transact all business in the Patent and Trademark Office 
connected herewith the following attorneys who are all members of the Bar of the District of Columbia, their 
registration numbers being listed after their names: 

Donald W. Banner, Registration No. 17,037; Edward F. McKie, Jr., Registration No. 17,335; William 
W. Beckett, Registration No. 18,262; Dale H. Hoscheit, Registration No. 19,090; Joseph M. Potenza, 
Registration No. 28,175; James A. Niegowski, Registration No. 28,331; Joseph M. Skerpon, Registration 
No. 29,864; Thomas L. Peterson, Registration No. 30,969; Nina L. Medlock, Registration No. 29,673; 
William J. Fisher, Registration No. 32,133; Thomas H. Jackson, Registration No. 29,808; Patricia E. Hong, 
Registration No. 34,373; Robert S. Katz, Registration No. 36,402, Brian E. Hanlon, Registration No. 
40,449, Sarah A. Kagan, Registration No. 32,141 and Lisa M. Hemmendinger, Registration No. 42,653. 

All correspondence and telephone communications should be addressed to: Banner & Witcoff, Ltd., 
Eleventh Floor, 1001 G Street, N.W., Washington, D.C. 20001-4597, telephone number (202) 508-9100, 
which is also the address and telephone number of each of the above listed attorneys. 



Signature                                   Date 

Full Name of 
First Inventor     VOGELSTEIN      Bert 
                   Family Name     First Given Name     Second Given Name 
Residence          3700 Breton Way, Baltimore, Maryland 21208 
Citizenship        United States of America 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Second Inventor    KINZLER         Kenneth              W. 
                   Family Name     First Given Name     Second Given Name 
Residence          1403 Halkirk Way, Bel Air, Maryland 21015 
Citizenship        United States of America 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Third Inventor     ALBERTSEN       Hans 
                   Family Name     First Given Name     Second Given Name 
Residence          744 Northcrest Drive, Salt Lake City, Utah 84103 
Citizenship        Denmark 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Fourth Inventor    ANAND           Rakesh 
                   Family Name     First Given Name     Second Given Name 
Residence          62 Grange Way, Sandbach, Cheshire CW11 9FS England 
Citizenship        British 
Post Office 
Address            Same as above 



Signature                                   Date 

Full Name of 
Fifth Inventor     CARLSON         Mary 
                   Family Name     First Given Name     Second Given Name 
Residence          2074 H. Sunnyside Avenue, Salt Lake City, Utah 
Citizenship        United States of America 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Sixth Inventor     GRODEN          Joanna 
                   Family Name     First Given Name     Second Given Name 
Residence          [illegible] 9th Avenue, Salt Lake City, Utah 84103 
Citizenship        United States of America 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Seventh Inventor   HEDGE           Philip 
                   Family Name     First Given Name     Second Given Name 
Residence          7, Rookery Rise, Winsford, Cheshire CW7 3RA England 
Citizenship        British 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Eighth Inventor    JOSLYN          Geoff 
                   Family Name     First Given Name     Second Given Name 
Residence          426 7th Avenue, Salt Lake City, Utah 84103 
Citizenship        United States of America 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Ninth Inventor     MARKHAM         Alexander 
                   Family Name     First Given Name     Second Given Name 
Residence          25, Booth Bed Lane, Goostrey, Crewe, Cheshire, England 
Citizenship        British 
Post Office 
Address            Same as above 

Signature                                   Date 

Full Name of 
Tenth Inventor     NAKAMURA        Yusuke 
                   Family Name     First Given Name     Second Given Name 
Residence          1-43-3 Matsuyama, Kiyose, Tokyo 204 Japan 
Citizenship        Japanese 
Post Office 
Address            Same as above 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

    (iii) NUMBER OF SEQUENCES: 102 

(2) INFORMATION FOR SEQ ID NO:1: 

    (i) SEQUENCE CHARACTERISTICS: 
        (A) LENGTH: 9606 base pairs 
        (B) TYPE: nucleic acid 
        (C) STRANDEDNESS: double 
        (D) TOPOLOGY: linear 

    (ii) MOLECULE TYPE: cDNA 

    (vi) ORIGINAL SOURCE: 
        (A) ORGANISM: Homo sapiens 

    (vii) IMMEDIATE SOURCE: 
        (B) CLONE: DP2.5(APC) 

    (ix) FEATURE: 
        (A) NAME/KEY: CDS 
        (B) LOCATION: 34..8562 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
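
Each nucleotide row of the listing that follows ends with the coordinate of its last nucleotide, which offers a simple way to catch transcription errors. The Python sketch below is an illustrative check and not part of the original filing: it strips whitespace from a nucleotide row, reads the trailing digits as the end coordinate, and confirms that the coordinate advances by the number of bases in the row. The function names `parse_row` and `check_rows` are assumptions.

```python
def parse_row(line: str) -> tuple[str, int]:
    """Split a nucleotide row into its bases and its trailing end coordinate."""
    bases = "".join(ch for ch in line.upper() if ch in "ACGT")
    digits = "".join(ch for ch in line if ch.isdigit())
    return bases, int(digits)

def check_rows(lines: list[str], start: int = 1) -> bool:
    """Return True if every row's end coordinate matches the running base count."""
    position = start - 1
    for line in lines:
        bases, end = parse_row(line)
        position += len(bases)
        if position != end:
            return False
    return True

# First two nucleotide rows of SEQ ID NO:1.
rows = [
    "GGACTCGGAA ATGAGGTCCA AGGGTAGCCA AGG ATG GCT GCA GCT TCA TAT GAT 54",
    "CAG TTG TTA AAG CAA GTT GAG GCA CTG AAG ATG GAG AAC TCA AAT CTT 102",
]
print(check_rows(rows))  # True
```

Because whitespace is discarded before counting, the check tolerates codons or coordinates that were split by stray spaces during extraction, but a substituted character (for example the letter O in place of G) would be reported as a mismatch.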

GGACTCGGAA ATGAGGTCCA AGGGTAGCCA AGG ATG GCT GCA GCT TCA TAT GAT 54 
Met Ala Ala Ala Ser Tyr Asp 
1 5 

CAG TTG TTA AAG CAA GTT GAG GCA CTG AAG ATG GAG AAC TCA AAT CTT 102 
Gln Leu Leu Lys Gln Val Glu Ala Leu Lys Met Glu Asn Ser Asn Leu 
10 15 20 

CGA CAA GAG CTA GAA GAT AAT TCC AAT CAT CTT ACA AAA CTG GAA ACT 150 
Arg Gln Glu Leu Glu Asp Asn Ser Asn His Leu Thr Lys Leu Glu Thr 
25 30 35 

GAG GCA TCT AAT ATG AAG GAA GTA CTT AAA CAA CTA CAA GGA AGT ATT 198 
Glu Ala Ser Asn Met Lys Glu Val Leu Lys Gln Leu Gln Gly Ser Ile 
40 45 50 55 

GAA GAT GAA GCT ATG GCT TCT TCT GGA CAG ATT GAT TTA TTA GAG CGT 246 
Glu Asp Glu Ala Met Ala Ser Ser Gly Gln Ile Asp Leu Leu Glu Arg 
60 65 70 

CTT AAA GAG CTT AAC TTA GAT AGC AGT AAT TTC CCT GGA GTA AAA CTG 294 
Leu Lys Glu Leu Asn Leu Asp Ser Ser Asn Phe Pro Gly Val Lys Leu 
75 80 85 

CGG TCA AAA ATG TCC CTC CGT TCT TAT GGA AGC CGG GAA GGA TCT GTA 342 
Arg Ser Lys Met Ser Leu Arg Ser Tyr Gly Ser Arg Glu Gly Ser Val 
90 95 100 

TCA AGC CGT TCT GGA GAG TGC AGT CCT GTT CCT ATG GGT TCA TTT CCA 390 
Ser Ser Arg Ser Gly Glu Cys Ser Pro Val Pro Met Gly Ser Phe Pro 
105 110 115 

AGA AGA GGG TTT GTA AAT GGA AGC AGA GAA AGT ACT GGA TAT TTA GAA 438 
Arg Arg Gly Phe Val Asn Gly Ser Arg Glu Ser Thr Gly Tyr Leu Glu 
120 125 130 135 

GAA CTT GAG AAA GAG AGG TCA TTG CTT CTT GCT GAT CTT GAC AAA GAA 486 
Glu Leu Glu Lys Glu Arg Ser Leu Leu Leu Ala Asp Leu Asp Lys Glu 
140 145 150 

GAA AAG GAA AAA GAC TGG TAT TAC GCT CAA CTT CAG AAT CTC ACT AAA 534 
Glu Lys Glu Lys Asp Trp Tyr Tyr Ala Gln Leu Gln Asn Leu Thr Lys 
155 160 165 

AGA ATA GAT AGT CTT CCT TTA ACT GAA AAT TTT TCC TTA CAA ACA GAT 582 
Arg Ile Asp Ser Leu Pro Leu Thr Glu Asn Phe Ser Leu Gln Thr Asp 
170 175 180 

TTG ACC AGA AGG CAA TTG GAA TAT GAA GCA AGG CAA ATC AGA GTT GCG 630 
Leu Thr Arg Arg Gln Leu Glu Tyr Glu Ala Arg Gln Ile Arg Val Ala 
185 190 195 

ATG GAA GAA CAA CTA GGT ACC TGC CAG GAT ATG GAA AAA CGA GCA CAG 678 
Met Glu Glu Gln Leu Gly Thr Cys Gln Asp Met Glu Lys Arg Ala Gln 
200 205 210 215 

CGA AGA ATA GCC AGA ATT CAG CAA ATC GAA AAG GAC ATA CTT CGT ATA 726 
Arg Arg Ile Ala Arg Ile Gln Gln Ile Glu Lys Asp Ile Leu Arg Ile 
220 225 230 

CGA CAG CTT TTA CAG TCC CAA GCA ACA GAA GCA GAG AGG TCA TCT CAG 774 
Arg Gln Leu Leu Gln Ser Gln Ala Thr Glu Ala Glu Arg Ser Ser Gln 
235 240 245 

AAC AAG CAT GAA ACC GGC TCA CAT GAT GCT GAG CGG CAG AAT GAA GGT 822 
Asn Lys His Glu Thr Gly Ser His Asp Ala Glu Arg Gln Asn Glu Gly 
250 255 260 

CAA GGA GTG GGA GAA ATC AAC ATG GCA ACT TCT GGT AAT GGT CAG GGT 870 
Gln Gly Val Gly Glu Ile Asn Met Ala Thr Ser Gly Asn Gly Gln Gly 
265 270 275 

TCA ACT ACA CGA ATG GAC CAT GAA ACA GCC AGT GTT TTG AGT TCT AGT 918 
Ser Thr Thr Arg Met Asp His Glu Thr Ala Ser Val Leu Ser Ser Ser 
280 285 290 295 
AGC ACA CAC TCT GCA CCT CGA AGG CTG ACA AGT CAT CTG GGA ACC AAG 966 
Ser Thr His Ser Ala Pro Arg Arg Leu Thr Ser His Leu Gly Thr Lys 
300 305 310 

GTG GAA ATG GTG TAT TCA TTG TTG TCA ATG CTT GGT ACT CAT GAT AAG 1014 
Val Glu Met Val Tyr Ser Leu Leu Ser Met Leu Gly Thr His Asp Lys 
315 320 325 

GAT GAT ATG TCG CGA ACT TTG CTA GCT ATG TCT AGC TCC CAA GAC AGC 1062 
Asp Asp Met Ser Arg Thr Leu Leu Ala Met Ser Ser Ser Gln Asp Ser 
330 335 340 

TGT ATA TCC ATG CGA CAG TCT GGA TGT CTT CCT CTC CTC ATC CAG CTT 1110 
Cys Ile Ser Met Arg Gln Ser Gly Cys Leu Pro Leu Leu Ile Gln Leu 
345 350 355 

TTA CAT GGC AAT GAC AAA GAC TCT GTA TTG TTG GGA AAT TCC CGG GGC 1158 
Leu His Gly Asn Asp Lys Asp Ser Val Leu Leu Gly Asn Ser Arg Gly 
360 365 370 375 

AGT AAA GAG GCT CGG GCC AGG GCC AGT GCA GCA CTC CAC AAC ATC ATT 1206 
Ser Lys Glu Ala Arg Ala Arg Ala Ser Ala Ala Leu His Asn Ile Ile 
380 385 390 

CAC TCA CAG CCT GAT GAC AAG AGA GGC AGG CGT GAA ATC CGA GTC CTT 1254 
His Ser Gln Pro Asp Asp Lys Arg Gly Arg Arg Glu Ile Arg Val Leu 
395 400 405 

CAT CTT TTG GAA CAG ATA CGC GCT TAC TGT GAA ACC TGT TGG GAG TGG 1302 
His Leu Leu Glu Gln Ile Arg Ala Tyr Cys Glu Thr Cys Trp Glu Trp 
410 415 420 

CAG GAA GCT CAT GAA CCA GGC ATG GAC CAG GAC AAA AAT CCA ATG CCA 1350 
Gln Glu Ala His Glu Pro Gly Met Asp Gln Asp Lys Asn Pro Met Pro 
425 430 435 

GCT CCT GTT GAA CAT CAG ATC TGT CCT GCT GTG TGT GTT CTA ATG AAA 1398 
Ala Pro Val Glu His Gln Ile Cys Pro Ala Val Cys Val Leu Met Lys 
440 445 450 455 

CTT TCA TTT GAT GAA GAG CAT AGA CAT GCA ATG AAT GAA CTA GGG GGA 1446 
Leu Ser Phe Asp Glu Glu His Arg His Ala Met Asn Glu Leu Gly Gly 
460 465 470 

CTA CAG GCC ATT GCA GAA TTA TTG CAA GTG GAC TGT GAA ATG TAT GGG 1494 
Leu Gln Ala Ile Ala Glu Leu Leu Gln Val Asp Cys Glu Met Tyr Gly 
475 480 485 

CTT ACT AAT GAC CAC TAC AGT ATT ACA CTA AGA CGA TAT GCT GGA ATG 1542 
Leu Thr Asn Asp His Tyr Ser Ile Thr Leu Arg Arg Tyr Ala Gly Met 
490 495 500 

GCT TTG ACA AAC TTG ACT TTT GGA GAT GTA GCC AAC AAG GCT ACG CTA 1590 
Ala Leu Thr Asn Leu Thr Phe Gly Asp Val Ala Asn Lys Ala Thr Leu 
505 510 515 

TGC TCT ATG AAA GGC TGC ATG AGA GCA CTT GTG GCC CAA CTA AAA TCT 1638 
Cys Ser Met Lys Gly Cys Met Arg Ala Leu Val Ala Gln Leu Lys Ser 
520 525 530 535 

GAA AGT GAA GAC TTA CAG CAG GTT ATT GCA AGT GTT TTG AGG AAT TTG 1686 
Glu Ser Glu Asp Leu Gln Gln Val Ile Ala Ser Val Leu Arg Asn Leu 
540 545 550 

TCT TGG CGA GCA GAT GTA AAT AGT AAA AAG ACG TTG CGA GAA GTT GGA 1734 
Ser Trp Arg Ala Asp Val Asn Ser Lys Lys Thr Leu Arg Glu Val Gly 
555 560 565 

AGT GTG AAA GCA TTG ATG GAA TGT GCT TTA GAA GTT AAA AAG GAA TCA 1782 
Ser Val Lys Ala Leu Met Glu Cys Ala Leu Glu Val Lys Lys Glu Ser 
570 575 580 

ACC CTC AAA AGC GTA TTG AGT GCC TTA TGG AAT TTG TCA GCA CAT TGC 1830 
Thr Leu Lys Ser Val Leu Ser Ala Leu Trp Asn Leu Ser Ala His Cys 
585 590 595 

ACT GAG AAT AAA GCT GAT ATA TGT GCT GTA GAT GGT GCA CTT GCA TTT 1878 
Thr Glu Asn Lys Ala Asp Ile Cys Ala Val Asp Gly Ala Leu Ala Phe 
600 605 610 615 
TTG GTT GGC ACT CTT ACT TAC CGG AGC CAG ACA AAC ACT TTA GCC ATT 1926 
Leu Val Gly Thr Leu Thr Tyr Arg Ser Gln Thr Asn Thr Leu Ala Ile 
620 625 630 

ATT GAA AGT GGA GGT GGG ATA TTA CGG AAT GTG TCC AGC TTG ATA GCT 1974 
Ile Glu Ser Gly Gly Gly Ile Leu Arg Asn Val Ser Ser Leu Ile Ala 
635 640 645 

ACA AAT GAG GAC CAC AGG CAA ATC CTA AGA GAG AAC AAC TGT CTA CAA 2022 
Thr Asn Glu Asp His Arg Gln Ile Leu Arg Glu Asn Asn Cys Leu Gln 
650 655 660 

ACT TTA TTA CAA CAC TTA AAA TCT CAT AGT TTG ACA ATA GTC AGT AAT 2070 
Thr Leu Leu Gln His Leu Lys Ser His Ser Leu Thr Ile Val Ser Asn 
665 670 675 

GCA TGT GGA ACT TTG TGG AAT CTC TCA GCA AGA AAT CCT AAA GAC CAG 2118 
Ala Cys Gly Thr Leu Trp Asn Leu Ser Ala Arg Asn Pro Lys Asp Gln 
680 685 690 695 

GAA GCA TTA TGG GAC ATG GGG GCA GTT AGC ATG CTC AAG AAC CTC ATT 2166 
Glu Ala Leu Trp Asp Met Gly Ala Val Ser Met Leu Lys Asn Leu Ile 
700 705 710 

CAT TCA AAG CAC AAA ATG ATT GCT ATG GGA AGT GCT GCA GCT TTA AGG 2214 
His Ser Lys His Lys Met Ile Ala Met Gly Ser Ala Ala Ala Leu Arg 
715 720 725 

AAT CTC ATG GCA AAT AGG CCT GCG AAG TAC AAG GAT GCC AAT ATT ATG 2262 
Asn Leu Met Ala Asn Arg Pro Ala Lys Tyr Lys Asp Ala Asn Ile Met 
730 735 740 

TCT CCT GGC TCA AGC TTG CCA TCT CTT CAT GTT AGG AAA CAA AAA GCC 2310 
Ser Pro Gly Ser Ser Leu Pro Ser Leu His Val Arg Lys Gln Lys Ala 
745 750 755 

CTA GAA GCA GAA TTA GAT GCT CAG CAC TTA TCA GAA ACT TTT GAC AAT 2358 
Leu Glu Ala Glu Leu Asp Ala Gln His Leu Ser Glu Thr Phe Asp Asn 
760 765 770 775 

ATA GAC AAT TTA AGT CCC AAG GCA TCT CAT CGT AGT AAG CAG AGA CAC 2406 
Ile Asp Asn Leu Ser Pro Lys Ala Ser His Arg Ser Lys Gln Arg His 
780 785 790 

AAG CAA AGT CTC TAT GGT GAT TAT GTT TTT GAC ACC AAT CGA CAT GAT 2454 
Lys Gln Ser Leu Tyr Gly Asp Tyr Val Phe Asp Thr Asn Arg His Asp 
795 800 805 

GAT AAT AGG TCA GAC AAT TTT AAT ACT GGC AAC ATG ACT GTC CTT TCA 2502 
Asp Asn Arg Ser Asp Asn Phe Asn Thr Gly Asn Met Thr Val Leu Ser 
810 815 820 

CCA TAT TTG AAT ACT ACA GTG TTA CCC AGC TCC TCT TCA TCA AGA GGA 2550 
Pro Tyr Leu Asn Thr Thr Val Leu Pro Ser Ser Ser Ser Ser Arg Gly 
825 830 835 

AGC TTA GAT AGT TCT CGT TCT GAA AAA GAT AGA AGT TTG GAG AGA GAA 2598 
Ser Leu Asp Ser Ser Arg Ser Glu Lys Asp Arg Ser Leu Glu Arg Glu 
840 845 850 855 

CGC GGA ATT GGT CTA GGC AAC TAC CAT CCA GCA ACA GAA AAT CCA GGA 2646 
Arg Gly Ile Gly Leu Gly Asn Tyr His Pro Ala Thr Glu Asn Pro Gly 
860 865 870 

ACT TCT TCA AAG CGA GGT TTG CAG ATC TCC ACC ACT GCA GCC CAG ATT 2694 
Thr Ser Ser Lys Arg Gly Leu Gln Ile Ser Thr Thr Ala Ala Gln Ile 
875 880 885 

GCC AAA GTC ATG GAA GAA GTG TCA GCC ATT CAT ACC TCT CAG GAA GAC 2742 
Ala Lys Val Met Glu Glu Val Ser Ala Ile His Thr Ser Gln Glu Asp 
890 895 900 

AGA AGT TCT GGG TCT ACC ACT GAA TTA CAT TGT GTG ACA GAT GAG AGA 2790 
Arg Ser Ser Gly Ser Thr Thr Glu Leu His Cys Val Thr Asp Glu Arg 
905 910 915 

AAT GCA CTT AGA AGA AGC TCT GCT GCC CAT ACA CAT TCA AAC ACT TAC 2838 
Asn Ala Leu Arg Arg Ser Ser Ala Ala His Thr His Ser Asn Thr Tyr 
920 925 930 935 
AAT TTC ACT AAG TCG GAA AAT TCA AAT AGG ACA TGT TCT ATG CCT TAT 2886 
Asn Phe Thr Lys Ser Glu Asn Ser Asn Arg Thr Cys Ser Met Pro Tyr 
940 945 950 

GCC AAA TTA GAA TAC AAG AGA TCT TCA AAT GAT AGT TTA AAT AGT GTC 2934 
Ala Lys Leu Glu Tyr Lys Arg Ser Ser Asn Asp Ser Leu Asn Ser Val 
955 960 965 

AGT AGT AAT GAT GGT TAT GGT AAA AGA GGT CAA ATG AAA CCC TCG ATT 2982 
Ser Ser Asn Asp Gly Tyr Gly Lys Arg Gly Gln Met Lys Pro Ser Ile 
970 975 980 

GAA TCC TAT TCT GAA GAT GAT GAA AGT AAG TTT TGC AGT TAT GGT CAA 3030 
Glu Ser Tyr Ser Glu Asp Asp Glu Ser Lys Phe Cys Ser Tyr Gly Gln 
985 990 995 

TAC CCA GCC GAC CTA GCC CAT AAA ATA CAT AGT GCA AAT CAT ATG GAT 3078 
Tyr Pro Ala Asp Leu Ala His Lys Ile His Ser Ala Asn His Met Asp 
1000 1005 1010 1015 

GAT AAT GAT GGA GAA CTA GAT ACA CCA ATA AAT TAT AGT CTT AAA TAT 3126 
Asp Asn Asp Gly Glu Leu Asp Thr Pro Ile Asn Tyr Ser Leu Lys Tyr 
1020 1025 1030 

TCA GAT GAG CAG TTG AAC TCT GGA AGG CAA AGT CCT TCA CAG AAT GAA 3174 
Ser Asp Glu Gln Leu Asn Ser Gly Arg Gln Ser Pro Ser Gln Asn Glu 
1035 1040 1045 

AGA TGG GCA AGA CCC AAA CAC ATA ATA GAA GAT GAA ATA AAA CAA AGT 3222 
Arg Trp Ala Arg Pro Lys His Ile Ile Glu Asp Glu Ile Lys Gln Ser 
1050 1055 1060 

GAG CAA AGA CAA TCA AGG AAT CAA AGT ACA ACT TAT CCT GTT TAT ACT 3270 
Glu Gln Arg Gln Ser Arg Asn Gln Ser Thr Thr Tyr Pro Val Tyr Thr 
1065 1070 1075 

GAG AGC ACT GAT GAT AAA CAC CTC AAG TTC CAA CCA CAT TTT GGA CAG 3318 
Glu Ser Thr Asp Asp Lys His Leu Lys Phe Gln Pro His Phe Gly Gln 
1080 1085 1090 1095 

CAG GAA TGT GTT TCT CCA TAC AGG TCA CGG GGA GCC AAT GGT TCA GAA 3366 
Gln Glu Cys Val Ser Pro Tyr Arg Ser Arg Gly Ala Asn Gly Ser Glu 
1100 1105 1110 

ACA AAT CGA GTG GGT TCT AAT CAT GGA ATT AAT CAA AAT GTA AGC CAG 3414 
Thr Asn Arg Val Gly Ser Asn His Gly Ile Asn Gln Asn Val Ser Gln 
1115 1120 1125 

TCT TTG TGT CAA GAA GAT GAC TAT GAA GAT GAT AAG CCT ACC AAT TAT 3462 
Ser Leu Cys Gln Glu Asp Asp Tyr Glu Asp Asp Lys Pro Thr Asn Tyr 
1130 1135 1140 

AGT GAA CGT TAC TCT GAA GAA GAA CAG CAT GAA GAA GAA GAG AGA CCA 3510 
Ser Glu Arg Tyr Ser Glu Glu Glu Gln His Glu Glu Glu Glu Arg Pro 
1145 1150 1155 

ACA AAT TAT AGC ATA AAA TAT AAT GAA GAG AAA CGT CAT GTG GAT CAG 3558 
Thr Asn Tyr Ser Ile Lys Tyr Asn Glu Glu Lys Arg His Val Asp Gln 
1160 1165 1170 1175 

CCT ATT GAT TAT AGT TTA AAA TAT GCC ACA GAT ATT CCT TCA TCA CAG 3606 
Pro Ile Asp Tyr Ser Leu Lys Tyr Ala Thr Asp Ile Pro Ser Ser Gln 
1180 1185 1190 

AAA CAG TCA TTT TCA TTC TCA AAG AGT TCA TCT GGA CAA AGC AGT AAA 3654 
Lys Gln Ser Phe Ser Phe Ser Lys Ser Ser Ser Gly Gln Ser Ser Lys 
1195 1200 1205 

ACC GAA CAT ATG TCT TCA AGC AGT GAG AAT ACG TCC ACA CCT TCA TCT 3702 
Thr Glu His Met Ser Ser Ser Ser Glu Asn Thr Ser Thr Pro Ser Ser 
1210 1215 1220 

AAT GCC AAG AGG CAG AAT CAG CTC CAT CCA AGT TCT GCA CAG AGT AGA 3750 
Asn Ala Lys Arg Gln Asn Gln Leu His Pro Ser Ser Ala Gln Ser Arg 
1225 1230 1235 

AGT GGT CAG CCT CAA AAG GCT GCC ACT TGC AAA GTT TCT TCT ATT AAC 3798 
Ser Gly Gln Pro Gln Lys Ala Ala Thr Cys Lys Val Ser Ser Ile Asn 
1240 1245 1250 1255 
CAA GAA ACA ATA CAG ACT TAT TGT GTA GAA GAT ACT CCA ATA TGT TTT 3846 
Gln Glu Thr Ile Gln Thr Tyr Cys Val Glu Asp Thr Pro Ile Cys Phe 
1260 1265 1270 

TCA AGA TGT AGT TCA TTA TCA TCT TTG TCA TCA GCT GAA GAT GAA ATA 3894 
Ser Arg Cys Ser Ser Leu Ser Ser Leu Ser Ser Ala Glu Asp Glu Ile 
1275 1280 1285 

GGA TGT AAT CAG ACG ACA CAG GAA GCA GAT TCT GCT AAT ACC CTG CAA 3942 
Gly Cys Asn Gln Thr Thr Gln Glu Ala Asp Ser Ala Asn Thr Leu Gln 
1290 1295 1300 

ATA GCA GAA ATA AAA GGA AAG ATT GGA ACT AGG TCA GCT GAA GAT CCT 3990 
Ile Ala Glu Ile Lys Gly Lys Ile Gly Thr Arg Ser Ala Glu Asp Pro 
1305 1310 1315 

GTG AGC GAA GTT CCA GCA GTG TCA CAG CAC CCT AGA ACC AAA TCC AGC 4038 
Val Ser Glu Val Pro Ala Val Ser Gln His Pro Arg Thr Lys Ser Ser 
1320 1325 1330 1335 

AGA CTG CAG GGT TCT AGT TTA TCT TCA GAA TCA GCC AGG CAC AAA GCT 4086 
Arg Leu Gln Gly Ser Ser Leu Ser Ser Glu Ser Ala Arg His Lys Ala 
1340 1345 1350 

GTT GAA TTT CCT TCA GGA GCG AAA TCT CCC TCC AAA AGT GGT GCT CAG 4134 
Val Glu Phe Pro Ser Gly Ala Lys Ser Pro Ser Lys Ser Gly Ala Gln 
1355 1360 1365 

ACA CCC AAA AGT CCA CCT GAA CAC TAT GTT CAG GAG ACC CCA CTC ATG 4182 
Thr Pro Lys Ser Pro Pro Glu His Tyr Val Gln Glu Thr Pro Leu Met 
1370 1375 1380 

TTT AGC AGA TGT ACT TCT GTC AGT TCA CTT GAT AGT TTT GAG AGT CGT 4230 
Phe Ser Arg Cys Thr Ser Val Ser Ser Leu Asp Ser Phe Glu Ser Arg 
1385 1390 1395 

TCG ATT GCC AGC TCC GTT CAG AGT GAA CCA TGC AGT GGA ATG GTA AGT 4278 
Ser Ile Ala Ser Ser Val Gln Ser Glu Pro Cys Ser Gly Met Val Ser 
1400 1405 1410 1415 

GGC ATT ATA AGC CCC AGT GAT CTT CCA GAT AGC CCT GGA CAA ACC ATG 4326 
Gly Ile Ile Ser Pro Ser Asp Leu Pro Asp Ser Pro Gly Gln Thr Met 
1420 1425 1430 

CCA CCA AGC AGA AGT AAA ACA CCT CCA CCA CCT CCT CAA ACA GCT CAA 4374 
Pro Pro Ser Arg Ser Lys Thr Pro Pro Pro Pro Pro Gln Thr Ala Gln 
1435 1440 1445 

ACC AAG CGA GAA GTA CCT AAA AAT AAA GCA CCT ACT GCT GAA AAG AGA 4422 
Thr Lys Arg Glu Val Pro Lys Asn Lys Ala Pro Thr Ala Glu Lys Arg 
1450 1455 1460 

GAG AGT GGA CCT AAG CAA GCT GCA GTA AAT GCT GCA GTT CAG AGG GTC 4470 
Glu Ser Gly Pro Lys Gln Ala Ala Val Asn Ala Ala Val Gln Arg Val 
1465 1470 1475 

CAG GTT CTT CCA GAT GCT GAT ACT TTA TTA CAT TTT GCC ACA GAA AGT 4518 
Gln Val Leu Pro Asp Ala Asp Thr Leu Leu His Phe Ala Thr Glu Ser 
1480 1485 1490 1495 

ACT CCA GAT GGA TTT TCT TGT TCA TCC AGC CTG AGT GCT CTG AGC CTC 4566 
Thr Pro Asp Gly Phe Ser Cys Ser Ser Ser Leu Ser Ala Leu Ser Leu 
1500 1505 1510 

GAT GAG CCA TTT ATA CAG AAA GAT GTG GAA TTA AGA ATA ATG CCT CCA 4614 
Asp Glu Pro Phe Ile Gln Lys Asp Val Glu Leu Arg Ile Met Pro Pro 
1515 1520 1525 

GTT CAG GAA AAT GAC AAT GGG AAT GAA ACA GAA TCA GAG CAG CCT AAA 4662 
Val Gln Glu Asn Asp Asn Gly Asn Glu Thr Glu Ser Glu Gln Pro Lys 
1530 1535 1540 

GAA TCA AAT GAA AAC CAA GAG AAA GAG GCA GAA AAA ACT ATT GAT TCT 4710 
Glu Ser Asn Glu Asn Gln Glu Lys Glu Ala Glu Lys Thr Ile Asp Ser 
1545 1550 1555 

GAA AAG GAC CTA TTA GAT GAT TCA GAT GAT GAT GAT ATT GAA ATA CTA 4758 
Glu Lys Asp Leu Leu Asp Asp Ser Asp Asp Asp Asp Ile Glu Ile Leu 
1560 1565 1570 1575 
GAA GAA TGT ATT ATT TCT GCC ATG CCA ACA AAG TCA TCA CGT AAA GGC 4806 
Glu Glu Cys Ile Ile Ser Ala Met Pro Thr Lys Ser Ser Arg Lys Gly 
1580 1585 1590 

AAA AAG CCA GCC CAG ACT GCT TCA AAA TTA CCT CCA CCT GTG GCA AGG 4854 
Lys Lys Pro Ala Gln Thr Ala Ser Lys Leu Pro Pro Pro Val Ala Arg 
1595 1600 1605 

AAA CCA AGT CAG CTG CCT GTG TAC AAA CTT CTA CCA TCA CAA AAC AGG 4902 
Lys Pro Ser Gln Leu Pro Val Tyr Lys Leu Leu Pro Ser Gln Asn Arg 
1610 1615 1620 

TTG CAA CCC CAA AAG CAT GTT AGT TTT ACA CCG GGG GAT GAT ATG CCA 4950 
Leu Gln Pro Gln Lys His Val Ser Phe Thr Pro Gly Asp Asp Met Pro 
1625 1630 1635 

CGG GTG TAT TGT GTT GAA GGG ACA CCT ATA AAC TTT TCC ACA GCT ACA 4998 
Arg Val Tyr Cys Val Glu Gly Thr Pro Ile Asn Phe Ser Thr Ala Thr 
1640 1645 1650 1655 

TCT CTA AGT GAT CTA ACA ATC GAA TCC CCT CCA AAT GAG TTA GCT GCT 5046 
Ser Leu Ser Asp Leu Thr Ile Glu Ser Pro Pro Asn Glu Leu Ala Ala 
1660 1665 1670 

GGA GAA GGA GTT AGA GGA GGA GCA CAG TCA GGT GAA TTT GAA AAA CGA 5094 
Gly Glu Gly Val Arg Gly Gly Ala Gln Ser Gly Glu Phe Glu Lys Arg 
1675 1680 1685 

GAT ACC ATT CCT ACA GAA GGC AGA AGT ACA GAT GAG GCT CAA GGA GGA 5142 
Asp Thr Ile Pro Thr Glu Gly Arg Ser Thr Asp Glu Ala Gln Gly Gly 
1690 1695 1700 

AAA ACC TCA TCT GTA ACC ATA CCT GAA TTG GAT GAC AAT AAA GCA GAG 5190 
Lys Thr Ser Ser Val Thr Ile Pro Glu Leu Asp Asp Asn Lys Ala Glu 
1705 1710 1715 

GAA GGT GAT ATT CTT GCA GAA TGC ATT AAT TCT GCT ATG CCC AAA GGG 5238 
Glu Gly Asp Ile Leu Ala Glu Cys Ile Asn Ser Ala Met Pro Lys Gly 
1720 1725 1730 1735 

AAA AGT CAC AAG CCT TTC CGT GTG AAA AAG ATA ATG GAC CAG GTC CAG 5286 
Lys Ser His Lys Pro Phe Arg Val Lys Lys Ile Met Asp Gln Val Gln 
1740 1745 1750 

CAA GCA TCT GCG TCG TCT TCT GCA CCC AAC AAA AAT CAG TTA GAT GGT 5334 
Gln Ala Ser Ala Ser Ser Ser Ala Pro Asn Lys Asn Gln Leu Asp Gly 
1755 1760 1765 

AAG AAA AAG AAA CCA ACT TCA CCA GTA AAA CCT ATA CCA CAA AAT ACT 5382 
Lys Lys Lys Lys Pro Thr Ser Pro Val Lys Pro Ile Pro Gln Asn Thr 
1770 1775 1780 

GAA TAT AGG ACA CGT GTA AGA AAA AAT GCA GAC TCA AAA AAT AAT TTA 5430 
Glu Tyr Arg Thr Arg Val Arg Lys Asn Ala Asp Ser Lys Asn Asn Leu 
1785 1790 1795 

AAT GCT GAG AGA GTT TTC TCA GAC AAC AAA GAT TCA AAG AAA CAG AAT 5478 
Asn Ala Glu Arg Val Phe Ser Asp Asn Lys Asp Ser Lys Lys Gln Asn 

1800 1805 1810 1815 

TTG AAA AAT AAT TCC AAG GAC TTC AAT GAT AAG CTC CCA AAT AAT GAA 5526 

Leu Lys Asn Asn Ser Lys Asp Phe Asn Asp Lys Leu Pro Asn Asn Glu 

1820 1825 1830 

GAT AGA GTC AGA GGA AGT TTT GCT TTT GAT TCA CCT CAT CAT TAC ACG 5574 

Asp Arg Val Arg Gly Sex Phe Ala Phe Asp Ser Pro His His Tyi Thr 

1835 1840 1845 

CCT ATT GAA GGA ACT CCT TAC TGT TTT TCA CGA AAT GAT TCT TTG AGT 5622 

Pro lie Glu Gly Thr Pro Tyr Cys Phe Ser Arg Asn Asp Ser Leu Ser 

1850 1855 1860 

TCT CTA GAT TTT GAT GAT GAT GAT GTT GAC CTT TCC AGG GAA AAG GCT 5670 

Ser Leu Asp Phe Asp Asp Asp Asp Val Asp Leu Ser Arg Glu Lys Ala 

1865 1870 1875 

GAA TTA AGA AAG GCA AAA GAA AAT AAG GAA TCA GAG GCT AAA GTT ACC 5718 

Glu Leu Arg Lys Ala Lys Glu Asn Lys Glu Ser Glu Ala Lys Val Thr

1880 1885 1890 1895

AGC CAC ACA GAA CTA ACC TCC AAC CAA CAA TCA GCT AAT AAG ACA CAA 5766

Ser His Thr Glu Leu Thr Ser Asn Gln Gln Ser Ala Asn Lys Thr Gln

1900 1905 1910

GCT ATT GCA AAG CAG CCA ATA AAT CGA GGT CAG CCT AAA CCC ATA CTT 5814

Ala Ile Ala Lys Gln Pro Ile Asn Arg Gly Gln Pro Lys Pro Ile Leu

1915 1920 1925

CAG AAA CAA TCC ACT TTT CCC CAG TCA TCC AAA GAC ATA CCA GAC AGA 5862

Gln Lys Gln Ser Thr Phe Pro Gln Ser Ser Lys Asp Ile Pro Asp Arg

1930 1935 1940

GGG GCA GCA ACT GAT GAA AAG TTA CAG AAT TTT GCT ATT GAA AAT ACT 5910

Gly Ala Ala Thr Asp Glu Lys Leu Gln Asn Phe Ala Ile Glu Asn Thr

1945 1950 1955

CCA GTT TGC TTT TCT CAT AAT TCC TCT CTG AGT TCT CTC AGT GAC ATT 5958

Pro Val Cys Phe Ser His Asn Ser Ser Leu Ser Ser Leu Ser Asp Ile

1960 1965 1970 1975

GAC CAA GAA AAC AAC AAT AAA GAA AAT GAA CCT ATC AAA GAG ACT GAG 6006

Asp Gln Glu Asn Asn Asn Lys Glu Asn Glu Pro Ile Lys Glu Thr Glu

1980 1985 1990

CCC CCT GAC TCA CAG GGA GAA CCA AGT AAA CCT CAA GCA TCA GGC TAT 6054

Pro Pro Asp Ser Gln Gly Glu Pro Ser Lys Pro Gln Ala Ser Gly Tyr

1995 2000 2005

GCT CCT AAA TCA TTT CAT GTT GAA GAT ACC CCA GTT TGT TTC TCA AGA 6102

Ala Pro Lys Ser Phe His Val Glu Asp Thr Pro Val Cys Phe Ser Arg

2010 2015 2020

AAC AGT TCT CTC AGT TCT CTT AGT ATT GAC TCT GAA GAT GAC CTG TTG 6150

Asn Ser Ser Leu Ser Ser Leu Ser Ile Asp Ser Glu Asp Asp Leu Leu

2025 2030 2035

CAG GAA TGT ATA AGC TCC GCA ATG CCA AAA AAG AAA AAG CCT TCA AGA 6198

Gln Glu Cys Ile Ser Ser Ala Met Pro Lys Lys Lys Lys Pro Ser Arg

2040 2045 2050 2055

CTC AAG GGT GAT AAT GAA AAA CAT AGT CCC AGA AAT ATG GGT GGC ATA 6246

Leu Lys Gly Asp Asn Glu Lys His Ser Pro Arg Asn Met Gly Gly Ile

2060 2065 2070

TTA GGT GAA GAT CTG ACA CTT GAT TTG AAA GAT ATA CAG AGA CCA GAT 6294

Leu Gly Glu Asp Leu Thr Leu Asp Leu Lys Asp Ile Gln Arg Pro Asp

2075 2080 2085

TCA GAA CAT GGT CTA TCC CCT GAT TCA GAA AAT TTT GAT TGG AAA GCT 6342

Ser Glu His Gly Leu Ser Pro Asp Ser Glu Asn Phe Asp Trp Lys Ala

2090 2095 2100

ATT CAG GAA GGT GCA AAT TCC ATA GTA AGT AGT TTA CAT CAA GCT GCT 6390

Ile Gln Glu Gly Ala Asn Ser Ile Val Ser Ser Leu His Gln Ala Ala

2105 2110 2115

GCT GCT GCA TGT TTA TCT AGA CAA GCT TCG TCT GAT TCA GAT TCC ATC 6438

Ala Ala Ala Cys Leu Ser Arg Gln Ala Ser Ser Asp Ser Asp Ser Ile

2120 2125 2130 2135

CTT TCC CTG AAA TCA GGA ATC TCT CTG GGA TCA CCA TTT CAT CTT ACA 6486

Leu Ser Leu Lys Ser Gly Ile Ser Leu Gly Ser Pro Phe His Leu Thr

2140 2145 2150

CCT GAT CAA GAA GAA AAA CCC TTT ACA AGT AAT AAA GGC CCA CGA ATT 6534

Pro Asp Gln Glu Glu Lys Pro Phe Thr Ser Asn Lys Gly Pro Arg Ile

2155 2160 2165

CTA AAA CCA GGG GAG AAA AGT ACA TTG GAA ACT AAA AAG ATA GAA TCT 6582

Leu Lys Pro Gly Glu Lys Ser Thr Leu Glu Thr Lys Lys Ile Glu Ser

2170 2175 2180

GAA AGT AAA GGA ATC AAA GGA GGA AAA AAA GTT TAT AAA AGT TTG ATT 6630

Glu Ser Lys Gly Ile Lys Gly Gly Lys Lys Val Tyr Lys Ser Leu Ile

2185 2190 2195

ACT GGA AAA GTT CGA TCT AAT TCA GAA ATT TCA GGC CAA ATG AAA CAG 6678

Thr Gly Lys Val Arg Ser Asn Ser Glu Ile Ser Gly Gln Met Lys Gln

2200 2205 2210 2215

CCC CTT CAA GCA AAC ATG CCT TCA ATC TCT CGA GGC AGG ACA ATG ATT 6726

Pro Leu Gln Ala Asn Met Pro Ser Ile Ser Arg Gly Arg Thr Met Ile

2220 2225 2230

CAT ATT CCA GGA GTT CGA AAT AGC TCC TCA AGT ACA AGT CCT GTT TCT 6774

His Ile Pro Gly Val Arg Asn Ser Ser Ser Ser Thr Ser Pro Val Ser

2235 2240 2245

AAA AAA GGC CCA CCC CTT AAG ACT CCA GCC TCC AAA AGC CCT AGT GAA 6822

Lys Lys Gly Pro Pro Leu Lys Thr Pro Ala Ser Lys Ser Pro Ser Glu

2250 2255 2260

GGT CAA ACA GCC ACC ACT TCT CCT AGA GGA GCC AAG CCA TCT GTG AAA 6870

Gly Gln Thr Ala Thr Thr Ser Pro Arg Gly Ala Lys Pro Ser Val Lys

2265 2270 2275

TCA GAA TTA AGC CCT GTT GCC AGG CAG ACA TCC CAA ATA GGT GGG TCA 6918

Ser Glu Leu Ser Pro Val Ala Arg Gln Thr Ser Gln Ile Gly Gly Ser

2280 2285 2290 2295

AGT AAA GCA CCT TCT AGA TCA GGA TCT AGA GAT TCG ACC CCT TCA AGA 6966

Ser Lys Ala Pro Ser Arg Ser Gly Ser Arg Asp Ser Thr Pro Ser Arg

2300 2305 2310

CCT GCC CAG CAA CCA TTA AGT AGA CCT ATA CAG TCT CCT GGC CGA AAC 7014

Pro Ala Gln Gln Pro Leu Ser Arg Pro Ile Gln Ser Pro Gly Arg Asn

2315 2320 2325

TCA ATT TCC CCT GGT AGA AAT GGA ATA AGT CCT CCT AAC AAA TTA TCT 7062

Ser Ile Ser Pro Gly Arg Asn Gly Ile Ser Pro Pro Asn Lys Leu Ser

2330 2335 2340

CAA CTT CCA AGG ACA TCA TCC CCT AGT ACT GCT TCA ACT AAG TCC TCA 7110

Gln Leu Pro Arg Thr Ser Ser Pro Ser Thr Ala Ser Thr Lys Ser Ser

2345 2350 2355

GGT TCT GGA AAA ATG TCA TAT ACA TCT CCA GGT AGA CAG ATG AGC CAA 7158

Gly Ser Gly Lys Met Ser Tyr Thr Ser Pro Gly Arg Gln Met Ser Gln

2360 2365 2370 2375

CAG AAC CTT ACC AAA CAA ACA GGT TTA TCC AAG AAT GCC AGT AGT ATT 7206

Gln Asn Leu Thr Lys Gln Thr Gly Leu Ser Lys Asn Ala Ser Ser Ile

2380 2385 2390

CCA AGA AGT GAG TCT GCC TCC AAA GGA CTA AAT CAG ATG AAT AAT GGT 7254

Pro Arg Ser Glu Ser Ala Ser Lys Gly Leu Asn Gln Met Asn Asn Gly

2395 2400 2405

AAT GGA GCC AAT AAA AAG GTA GAA CTT TCT AGA ATG TCT TCA ACT AAA 7302

Asn Gly Ala Asn Lys Lys Val Glu Leu Ser Arg Met Ser Ser Thr Lys

2410 2415 2420

TCA AGT GGA AGT GAA TCT GAT AGA TCA GAA AGA CCT GTA TTA GTA CGC 7350

Ser Ser Gly Ser Glu Ser Asp Arg Ser Glu Arg Pro Val Leu Val Arg

2425 2430 2435

CAG TCA ACT TTC ATC AAA GAA GCT CCA AGC CCA ACC TTA AGA AGA AAA 7398

Gln Ser Thr Phe Ile Lys Glu Ala Pro Ser Pro Thr Leu Arg Arg Lys

2440 2445 2450 2455

TTG GAG GAA TCT GCT TCA TTT GAA TCT CTT TCT CCA TCA TCT AGA CCA 7446

Leu Glu Glu Ser Ala Ser Phe Glu Ser Leu Ser Pro Ser Ser Arg Pro

2460 2465 2470

GCT TCT CCC ACT AGG TCC CAG GCA CAA ACT CCA GTT TTA AGT CCT TCC 7494

Ala Ser Pro Thr Arg Ser Gln Ala Gln Thr Pro Val Leu Ser Pro Ser

2475 2480 2485

CTT CCT GAT ATG TCT CTA TCC ACA CAT TCG TCT GTT CAG GCT GGT GGA 7542

Leu Pro Asp Met Ser Leu Ser Thr His Ser Ser Val Gln Ala Gly Gly

2490 2495 2500

TGG CGA AAA CTC CCA CCT AAT CTC AGT CCC ACT ATA GAG TAT AAT GAT 7590

Trp Arg Lys Leu Pro Pro Asn Leu Ser Pro Thr Ile Glu Tyr Asn Asp

2505 2510 2515

GGA AGA CCA GCA AAG CGC CAT GAT ATT GCA CGG TCT CAT TCT GAA AGT 7638

Gly Arg Pro Ala Lys Arg His Asp Ile Ala Arg Ser His Ser Glu Ser

2520 2525 2530 2535

CCT TCT AGA CTT CCA ATC AAT AGG TCA GGA ACC TGG AAA CGT GAG CAC 7686

Pro Ser Arg Leu Pro Ile Asn Arg Ser Gly Thr Trp Lys Arg Glu His

2540 2545 2550

AGC AAA CAT TCA TCA TCC CTT CCT CGA GTA AGC ACT TGG AGA AGA ACT 7734

Ser Lys His Ser Ser Ser Leu Pro Arg Val Ser Thr Trp Arg Arg Thr

2555 2560 2565

GGA AGT TCA TCT TCA ATT CTT TCT GCT TCA TCA GAA TCC AGT GAA AAA 7782

Gly Ser Ser Ser Ser Ile Leu Ser Ala Ser Ser Glu Ser Ser Glu Lys

2570 2575 2580

GCA AAA AGT GAG GAT GAA AAA CAT GTG AAC TCT ATT TCA GGA ACC AAA 7830

Ala Lys Ser Glu Asp Glu Lys His Val Asn Ser Ile Ser Gly Thr Lys

2585 2590 2595

CAA AGT AAA GAA AAC CAA GTA TCC GCA AAA GGA ACA TGG AGA AAA ATA 7878

Gln Ser Lys Glu Asn Gln Val Ser Ala Lys Gly Thr Trp Arg Lys Ile

2600 2605 2610 2615

AAA GAA AAT GAA TTT TCT CCC ACA AAT AGT ACT TCT CAG ACC GTT TCC 7926

Lys Glu Asn Glu Phe Ser Pro Thr Asn Ser Thr Ser Gln Thr Val Ser

2620 2625 2630

TCA GGT GCT ACA AAT GGT GCT GAA TCA AAG ACT CTA ATT TAT CAA ATG 7974

Ser Gly Ala Thr Asn Gly Ala Glu Ser Lys Thr Leu Ile Tyr Gln Met

2635 2640 2645

GCA CCT GCT GTT TCT AAA ACA GAG GAT GTT TGG GTG AGA ATT GAG GAC 8022

Ala Pro Ala Val Ser Lys Thr Glu Asp Val Trp Val Arg Ile Glu Asp

2650 2655 2660

TGT CCC ATT AAC AAT CCT AGA TCT GGA AGA TCT CCC ACA GGT AAT ACT 8070

Cys Pro Ile Asn Asn Pro Arg Ser Gly Arg Ser Pro Thr Gly Asn Thr

2665 2670 2675

CCC CCG GTG ATT GAC AGT GTT TCA GAA AAG GCA AAT CCA AAC ATT AAA 8118

Pro Pro Val Ile Asp Ser Val Ser Glu Lys Ala Asn Pro Asn Ile Lys

2680 2685 2690 2695

GAT TCA AAA GAT AAT CAG GCA AAA CAA AAT GTG GGT AAT GGC AGT GTT 8166

Asp Ser Lys Asp Asn Gln Ala Lys Gln Asn Val Gly Asn Gly Ser Val

2700 2705 2710

CCC ATG CGT ACC GTG GGT TTG GAA AAT CGC CTG ACC TCC TTT ATT CAG 8214

Pro Met Arg Thr Val Gly Leu Glu Asn Arg Leu Thr Ser Phe Ile Gln

2715 2720 2725

GTG GAT GCC CCT GAC CAA AAA GGA ACT GAG ATA AAA CCA GGA CAA AAT 8262

Val Asp Ala Pro Asp Gln Lys Gly Thr Glu Ile Lys Pro Gly Gln Asn

2730 2735 2740

AAT CCT GTC CCT GTA TCA GAG ACT AAT GAA AGT CCT ATA GTG GAA CGT 8310

Asn Pro Val Pro Val Ser Glu Thr Asn Glu Ser Pro Ile Val Glu Arg

2745 2750 2755

ACC CCA TTC AGT TCT AGC AGC TCA AGC AAA CAC AGT TCA CCT AGT GGG 8358

Thr Pro Phe Ser Ser Ser Ser Ser Ser Lys His Ser Ser Pro Ser Gly

2760 2765 2770 2775

ACT GTT GCT GCC AGA GTG ACT CCT TTT AAT TAC AAC CCA AGC CCT AGG 8406

Thr Val Ala Ala Arg Val Thr Pro Phe Asn Tyr Asn Pro Ser Pro Arg

2780 2785 2790

AAA AGC AGC GCA GAT AGC ACT TCA GCT CGG CCA TCT CAG ATC CCA ACT 8454

Lys Ser Ser Ala Asp Ser Thr Ser Ala Arg Pro Ser Gln Ile Pro Thr

2795 2800 2805

CCA GTG AAT AAC AAC ACA AAG AAG CGA GAT TCC AAA ACT GAC AGC ACA 8502

Pro Val Asn Asn Asn Thr Lys Lys Arg Asp Ser Lys Thr Asp Ser Thr

2810 2815 2820

GAA TCC AGT GGA ACC CAA AGT CCT AAG CGC CAT TCT GGG TCT TAC CTT 8550

Glu Ser Ser Gly Thr Gln Ser Pro Lys Arg His Ser Gly Ser Tyr Leu

2825 2830 2835

GTG ACA TCT GTT TAAAAGAGAG GAAGAATGAA ACTAAGAAAA TTCTATGTTA 8602
Val Thr Ser Val
2840

ATTACAACTG CTATATAGAC ATTTTGTTTC AAATGAAACT TTAAAAGACT GAAAAATTTT 8662

GTAAATAGGT TTGATTCTTG TTAGAGGGTT TTTGTTCTGG AAGCCATATT TGATAGTATA 8722

CTTTGTCTTC ACTGGTCTTA TTTTGGGAGG CACTCTTGAT GGTTAGGAAA AAATAGAAAG 8782

CCAAGTATGT TTGTACAGTA TGTTTTACAT GTATTTAAAG TAGCATCCCA TCCCAACTTC 8842

CTTAATTATT GCTTGTCTAA AATAATGAAC ACTACAGATA GGAAATATGA TATATTGCTG 8902

TTATCAATCA TTTCTAGATT ATAAACTGAC TAAACTTACA TCAGGGGAAA ATTGGTATTT 8962

ATGCAAAAAA AAAATGTTTT TGTCCTTGTG AGTCCATCTA ACATCATAAT TAATCATGTG 9022

GCTGTGAAAT TCACAGTAAT ATGGTTCCCG ATGAACAAGT TTACCCAGCC TGCTTTGCTT 9082

ACTGCATGAA TGAAACTGAT GGTTCAATTT CAGAAGTAAT GATTAACAGT TATGTGGTCA 9142

CATGATGTGC ATAGAGATAG CTACAGTGTA ATAATTTACA CTATTTTGTG CTCCAAACAA 9202

AACAAAAATC TGTGTAACTG TAAAACATTG AATGAAACTA TTTTACCTGA ACTAGATTTT 9262

ATCTGAAAGT AGGTAGAATT TTTGCTATGC TGTAATTTGT TGTATATTCT GGTATTTGAG 9322

GTGAGATGGC TGCTCTTTAT TAATGAGACA TGAATTGTGT CTCAACAGAA ACTAAATGAA 9382

CATTTCAGAA TAAATTATTG CTGTATGTAA ACTGTTACTG AAATTGGTAT TTGTTTGAAG 9442

GGTTTGTTTC ACATTTGTAT TAATTAATTG TTTAAAATGC CTCTTTTAAA AGCTTATATA 9502

AATTTTTTCT TCAGCTTCTA TGCATTAAGA GTAAAATTCC TCTTACTGTA ATAAAAACAT 9562

TGAAGAAGAC TGTTGCCACT TAACCATTCC ATGCGTTGGC ACTT 9606
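
The paired codon and residue rows of the coding region can be cross-checked mechanically. The following is a minimal sketch, not part of the listing as filed, assuming the standard genetic code and using `Ter` as a hypothetical label for a stop codon. The helper names (`translate_row`, `row_matches`) are illustrative only. The example row is the one ending at nucleotide 4326, written with the conventional spellings Ile and Gln.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_ONE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes; "Ter" is an assumed label for stop.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(nt_row: str) -> list:
    """Translate a row of space-separated codons into three-letter residues."""
    return [ONE_TO_THREE[CODON_TO_ONE[codon]] for codon in nt_row.upper().split()]


def row_matches(nt_row: str, aa_row: str) -> bool:
    """Return True when the codon row translates to exactly the residue row."""
    return translate_row(nt_row) == aa_row.split()


# Row of the listing ending at nucleotide 4326 (residues 1416-1431).
row_nt = "GGC ATT ATA AGC CCC AGT GAT CTT CCA GAT AGC CCT GGA CAA ACC ATG"
row_aa = "Gly Ile Ile Ser Pro Ser Asp Leu Pro Asp Ser Pro Gly Gln Thr Met"
print(row_matches(row_nt, row_aa))

# Final four codons of the open reading frame followed by the stop codon.
print(translate_row("GTG ACA TCT GTT TAA"))
```

A row for which `row_matches` returns False points to a transcription error in either the codon line or the residue line.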

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2843 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu

1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn

20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu

35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly

50 55 60

Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser

65 70 75 80

Asn Phe Pro Gly Val Lys Leu Arg Ser Lys Met Ser Leu Arg Ser Tyr

85 90 95

Gly Ser Arg Glu Gly Ser Val Ser Ser Arg Ser Gly Glu Cys Ser Pro

100 105 110

Val Pro Met Gly Ser Phe Pro Arg Arg Gly Phe Val Asn Gly Ser Arg

115 120 125

Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu

130 135 140

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala

145 150 155 160

Gln Leu Gln Asn Leu Thr Lys Arg Ile Asp Ser Leu Pro Leu Thr Glu

165 170 175

Asn Phe Ser Leu Gln Thr Asp Leu Thr Arg Arg Gln Leu Glu Tyr Glu
180 185 190

Ala Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln
195 200 205

Asp Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile
210 215 220

Glu Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr
225 230 235 240

Glu Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp
245 250 255

Ala Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala
260 265 270

Thr Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr
275 280 285

Ala Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu
290 295 300

Thr Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser
305 310 315 320

Met Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala
325 330 335

Met Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys
340 345 350

Leu Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val
355 360 365

Leu Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser
370 375 380

Ala Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly
385 390 395 400

Arg Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr
405 410 415

Cys Glu Thr Cys Trp Glu Trp Gln Glu Ala His Glu Pro Gly Met Asp
420 425 430

Gln Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro
435 440 445

Ala Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His
450 455 460

Ala Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln
465 470 475 480

Val Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr
485 490 495

Leu Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp
500 505 510

Val Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala
515 520 525

Leu Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile
530 535 540

Ala Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys
545 550 555 560

Lys Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala
565 570 575

Leu Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu
580 585 590

Trp Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala
595 600 605

Val Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser
610 615 620

Gln Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg
625 630 635 640

Asn Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu
645 650 655

Arg Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His
660 665 670

Ser Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser
675 680 685

Ala Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val
690 695 700

Ser Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met
705 710 715 720

Gly Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys
725 730 735

Tyr Lys Asp Ala Asn Ile Met Ser Pro Gly Ser Ser Leu Pro Ser Leu
740 745 750

His Val Arg Lys Gln Lys Ala Leu Glu Ala Glu Leu Asp Ala Gln His
755 760 765

Leu Ser Glu Thr Phe Asp Asn Ile Asp Asn Leu Ser Pro Lys Ala Ser
770 775 780

His Arg Ser Lys Gln Arg His Lys Gln Ser Leu Tyr Gly Asp Tyr Val
785 790 795 800

Phe Asp Thr Asn Arg His Asp Asp Asn Arg Ser Asp Asn Phe Asn Thr
805 810 815

Gly Asn Met Thr Val Leu Ser Pro Tyr Leu Asn Thr Thr Val Leu Pro 
820 825 830 

Ser Ser Ser Ser Ser Arg Gly Ser Leu Asp Ser Ser Arg Ser Glu Lys 
835 840 845 

Asp Arg Ser Leu Glu Arg Glu Arg Gly Ile Gly Leu Gly Asn Tyr His
850 855 860

Pro Ala Thr Glu Asn Pro Gly Thr Ser Ser Lys Arg Gly Leu Gln Ile
865 870 875 880

Ser Thr Thr Ala Ala Gln Ile Ala Lys Val Met Glu Glu Val Ser Ala
885 890 895

Ile His Thr Ser Gln Glu Asp Arg Ser Ser Gly Ser Thr Thr Glu Leu
900 905 910 

His Cys Val Thr Asp Glu Arg Asn Ala Leu Arg Arg Ser Ser Ala Ala 
915 920 925 

His Thr His Ser Asn Thr Tyr Asn Phe Thr Lys Ser Glu Asn Ser Asn 
930 935 940 

Arg Thr Cys Ser Met Pro Tyr Ala Lys Leu Glu Tyr Lys Arg Ser Ser 
945 950 955 960 

Asn Asp Ser Leu Asn Ser Val Ser Ser Asn Asp Gly Tyr Gly Lys Arg 
965 970 975 

Gly Gln Met Lys Pro Ser Ile Glu Ser Tyr Ser Glu Asp Asp Glu Ser
980 985 990

Lys Phe Cys Ser Tyr Gly Gln Tyr Pro Ala Asp Leu Ala His Lys Ile
995 1000 1005 

His Ser Ala Asn His Met Asp Asp Asn Asp Gly Glu Leu Asp Thr Pro 
1010 1015 1020 

Ile Asn Tyr Ser Leu Lys Tyr Ser Asp Glu Gln Leu Asn Ser Gly Arg
1025 1030 1035 1040

Gln Ser Pro Ser Gln Asn Gln Arg Trp Ala Arg Pro Lys His Ile Ile
1045 1050 1055

Glu Asp Glu Ile Lys Gln Ser Glu Gln Arg Gln Ser Arg Asn Gln Ser
1060 1065 1070

Thr Thr Tyr Pro Val Tyr Thr Glu Ser Thr Asp Asp Lys His Leu Lys
1075 1080 1085

Phe Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser
1090 1095 1100

Arg Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly
1105 1110 1115 1120

Ile Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu
1125 1130 1135

Asp Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln
1140 1145 1150

His Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu
1155 1160 1165

Glu Lys Arg His Val Asp Gln Pro Ile Asp Tyr Ser Leu Lys Tyr Ala
1170 1175 1180

Thr Asp Ile Pro Ser Ser Gln Lys Gln Ser Phe Ser Phe Ser Lys Ser
1185 1190 1195 1200

Ser Ser Gly Gln Ser Ser Lys Thr Glu His Met Ser Ser Ser Ser Glu
1205 1210 1215

Asn Thr Ser Thr Pro Ser Ser Asn Ala Lys Arg Gln Asn Gln Leu His
1220 1225 1230

Pro Ser Ser Ala Gln Ser Arg Ser Gly Gln Pro Gln Lys Ala Ala Thr
1235 1240 1245

Cys Lys Val Ser Ser Ile Asn Gln Glu Thr Ile Gln Thr Tyr Cys Val
1250 1255 1260

Glu Asp Thr Pro Ile Cys Phe Ser Arg Cys Ser Ser Leu Ser Ser Leu
1265 1270 1275 1280

Ser Ser Ala Glu Asp Glu Ile Gly Cys Asn Gln Thr Thr Gln Glu Ala
1285 1290 1295

Asp Ser Ala Asn Thr Leu Gln Ile Ala Glu Ile Lys Gly Lys Ile Gly
1300 1305 1310

Thr Arg Ser Ala Glu Asp Pro Val Ser Glu Val Pro Ala Val Ser Gln
1315 1320 1325

His Pro Arg Thr Lys Ser Ser Arg Leu Gln Gly Ser Ser Leu Ser Ser
1330 1335 1340

Glu Ser Ala Arg His Lys Ala Val Glu Phe Pro Ser Gly Ala Lys Ser
1345 1350 1355 1360

Pro Ser Lys Ser Gly Ala Gln Thr Pro Lys Ser Pro Pro Glu His Tyr
1365 1370 1375

Val Gln Glu Thr Pro Leu Met Phe Ser Arg Cys Thr Ser Val Ser Ser
1380 1385 1390

Leu Asp Ser Phe Glu Ser Arg Ser Ile Ala Ser Ser Val Gln Ser Glu
1395 1400 1405

Pro Cys Ser Gly Met Val Ser Gly Ile Ile Ser Pro Ser Asp Leu Pro
1410 1415 1420

Asp Ser Pro Gly Gln Thr Met Pro Pro Ser Arg Ser Lys Thr Pro Pro
1425 1430 1435 1440

Pro Pro Pro Gln Thr Ala Gln Thr Lys Arg Glu Val Pro Lys Asn Lys
1445 1450 1455

Ala Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val
1460 1465 1470

Asn Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu
1475 1480 1485

Leu His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser
1490 1495 1500

Ser Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val
1505 1510 1515 1520

Glu Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu
1525 1530 1535

Thr Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu
1540 1545 1550

Ala Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp
1555 1560 1565

Asp Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro
1570 1575 1580

Thr Lys Ser Ser Arg Lys Gly Lys Lys Pro Ala Gln Thr Ala Ser Lys
1585 1590 1595 1600

Leu Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys
1605 1610 1615

Leu Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe
1620 1625 1630

Thr Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro
1635 1640 1645

Ile Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser
1650 1655 1660

Pro Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln
1665 1670 1675 1680

Ser Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser
1685 1690 1695

Thr Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu
1700 1705 1710

Leu Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile
1715 1720 1725

Asn Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys
1730 1735 1740

Lys Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro
1745 1750 1755 1760

Asn Lys Asn Gln Leu Asp Gly Lys Lys Lys Lys Pro Thr Ser Pro Val
1765 1770 1775

Lys Pro Ile Pro Gln Asn Thr Glu Tyr Arg Thr Arg Val Arg Lys Asn
1780 1785 1790

Ala Asp Ser Lys Asn Asn Leu Asn Ala Glu Arg Val Phe Ser Asp Asn
1795 1800 1805

Lys Asp Ser Lys Lys Gln Asn Leu Lys Asn Asn Ser Lys Asp Phe Asn
1810 1815 1820

Asp Lys Leu Pro Asn Asn Glu Asp Arg Val Arg Gly Ser Phe Ala Phe
1825 1830 1835 1840

Asp Ser Pro His His Tyr Thr Pro Ile Glu Gly Thr Pro Tyr Cys Phe
1845 1850 1855 

Ser Arg Asn Asp Ser Leu Ser Ser Leu Asp Phe Asp Asp Asp Asp Val 
1860 1865 1870 

Asp Leu Ser Arg Glu Lys Ala Glu Leu Arg Lys Ala Lys Glu Asn Lys 
1875 1880 1885

Glu Ser Glu Ala Lys Val Thr Ser His Thr Glu Leu Thr Ser Asn Gln
1890 1895 1900

Gln Ser Ala Asn Lys Thr Gln Ala Ile Ala Lys Gln Pro Ile Asn Arg
1905 1910 1915 1920

Gly Gln Pro Lys Pro Ile Leu Gln Lys Gln Ser Thr Phe Pro Gln Ser
1925 1930 1935

Ser Lys Asp Ile Pro Asp Arg Gly Ala Ala Thr Asp Glu Lys Leu Gln
1940 1945 1950

Asn Phe Ala Ile Glu Asn Thr Pro Val Cys Phe Ser His Asn Ser Ser
1955 1960 1965

Leu Ser Ser Leu Ser Asp Ile Asp Gln Glu Asn Asn Asn Lys Glu Asn
1970 1975 1980

Glu Pro Ile Lys Glu Thr Glu Pro Pro Asp Ser Gln Gly Glu Pro Ser
1985 1990 1995 2000

Lys Pro Gln Ala Ser Gly Tyr Ala Pro Lys Ser Phe His Val Glu Asp
2005 2010 2015

Thr Pro Val Cys Phe Ser Arg Asn Ser Ser Leu Ser Ser Leu Ser Ile
2020 2025 2030

Asp Ser Glu Asp Asp Leu Leu Gln Glu Cys Ile Ser Ser Ala Met Pro
2035 2040 2045

Lys Lys Lys Lys Pro Ser Arg Leu Lys Gly Asp Asn Glu Lys His Ser
2050 2055 2060

Pro Arg Asn Met Gly Gly Ile Leu Gly Glu Asp Leu Thr Leu Asp Leu
2065 2070 2075 2080

Lys Asp Ile Gln Arg Pro Asp Ser Glu His Gly Leu Ser Pro Asp Ser
2085 2090 2095

Glu Asn Phe Asp Trp Lys Ala Ile Gln Glu Gly Ala Asn Ser Ile Val
2100 2105 2110

Ser Ser Leu His Gln Ala Ala Ala Ala Ala Cys Leu Ser Arg Gln Ala
2115 2120 2125

Ser Ser Asp Ser Asp Ser Ile Leu Ser Leu Lys Ser Gly Ile Ser Leu
2130 2135 2140

Gly Ser Pro Phe His Leu Thr Pro Asp Gln Glu Glu Lys Pro Phe Thr
2145 2150 2155 2160

Ser Asn Lys Gly Pro Arg Ile Leu Lys Pro Gly Glu Lys Ser Thr Leu
2165 2170 2175

Glu Thr Lys Lys Ile Glu Ser Glu Ser Lys Gly Ile Lys Gly Gly Lys
2180 2185 2190

Lys Val Tyr Lys Ser Leu Ile Thr Gly Lys Val Arg Ser Asn Ser Glu
2195 2200 2205

Ile Ser Gly Gln Met Lys Gln Pro Leu Gln Ala Asn Met Pro Ser Ile
2210 2215 2220

Ser Arg Gly Arg Thr Met Ile His Ile Pro Gly Val Arg Asn Ser Ser
2225 2230 2235 2240

Ser Ser Thr Ser Pro Val Ser Lys Lys Gly Pro Pro Leu Lys Thr Pro
2245 2250 2255

Ala Ser Lys Ser Pro Ser Glu Gly Gln Thr Ala Thr Thr Ser Pro Arg
2260 2265 2270

Gly Ala Lys Pro Ser Val Lys Ser Glu Leu Ser Pro Val Ala Arg Gln
2275 2280 2285

Thr Ser Gln Ile Gly Gly Ser Ser Lys Ala Pro Ser Arg Ser Gly Ser
2290 2295 2300 



Arg Asp Ser Thr Pro Ser Arg Pro Ala Gln Gln Pro Leu Ser Arg Pro
2305 2310 2315 2320

Ile Gln Ser Pro Gly Arg Asn Ser Ile Ser Pro Gly Arg Asn Gly Ile
2325 2330 2335

Ser Pro Pro Asn Lys Leu Ser Gln Leu Pro Arg Thr Ser Ser Pro Ser
2340 2345 2350

Thr Ala Ser Thr Lys Ser Ser Gly Ser Gly Lys Met Ser Tyr Thr Ser
2355 2360 2365

Pro Gly Arg Gln Met Ser Gln Gln Asn Leu Thr Lys Gln Thr Gly Leu
2370 2375 2380

Ser Lys Asn Ala Ser Ser Ile Pro Arg Ser Glu Ser Ala Ser Lys Gly
2385 2390 2395 2400

Leu Asn Gln Met Asn Asn Gly Asn Gly Ala Asn Lys Lys Val Glu Leu
2405 2410 2415

Ser Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser
2420 2425 2430

Glu Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro
2435 2440 2445

Ser Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser
2450 2455 2460

Leu Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln
2465 2470 2475 2480

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His
2485 2490 2495

Ser Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser
2500 2505 2510

Pro Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile
2515 2520 2525

Ala Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser
2530 2535 2540

Gly Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg
2545 2550 2555 2560

Val Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala
2565 2570 2575

Ser Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val
2580 2585 2590

Asn Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala
2595 2600 2605

Lys Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn
2610 2615 2620

Ser Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser
2625 2630 2635 2640

Lys Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp
2645 2650 2655

Val Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly
2660 2665 2670

Arg Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu
2675 2680 2685

Lys Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln
2690 2695 2700 

Asn Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn 
2705 2710 2715 2720 

Arg Leu Thr Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr
2725 2730 2735

Glu Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn
2740 2745 2750

Glu Ser Pro Ile Val Glu Arg Thr Pro Phe Ser Ser Ser Ser Ser Ser
2755 2760 2765

Lys His Ser Ser Pro Ser Gly Thr Val Ala Ala Arg Val Thr Pro Phe
2770 2775 2780

Asn Tyr Asn Pro Ser Pro Arg Lys Ser Ser Ala Asp Ser Thr Ser Ala
2785 2790 2795 2800

Arg Pro Ser Gln Ile Pro Thr Pro Val Asn Asn Asn Thr Lys Lys Arg
2805 2810 2815

Asp Ser Lys Thr Asp Ser Thr Glu Ser Ser Gly Thr Gln Ser Pro Lys
2820 2825 2830 

Arg His Ser Gly Ser Tyr Leu Val Thr Ser Val 
2835 2840 
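
The residue rows can also be collapsed to one-letter codes and checked against the stated LENGTH of 2843 amino acids. The following is a minimal sketch, not part of the listing as filed. The helper names are hypothetical, and the starting position 2833 for the final row is an assumption inferred from the ruler value 2835 falling under the third residue of that row.

```python
# Three-letter to one-letter residue codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row: str) -> str:
    """Collapse a row of three-letter residues into a one-letter string.

    A KeyError signals a residue token that is not a standard code,
    which is how scanning errors such as 'lie' or 'Gin' would surface.
    """
    return "".join(THREE_TO_ONE[residue] for residue in row.split())


def last_position(first_position: int, row: str) -> int:
    """Position of the final residue in a row that starts at first_position."""
    return first_position + len(row.split()) - 1


# Final row of SEQ ID NO:2; the start position 2833 is inferred from the ruler.
final_row = "Arg His Ser Gly Ser Tyr Leu Val Thr Ser Val"
print(to_one_letter(final_row))        # RHSGSYLVTSV
print(last_position(2833, final_row))  # 2843
```

The final position agrees with the LENGTH field of the sequence characteristics.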



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: DP1(TB2)

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1-630
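
The CDS location fixes how many residues the translated rows should contain. The following is a minimal sketch of that arithmetic, not part of the listing as filed. It assumes an inclusive, 1-based range that excludes the stop codon, and both function names are hypothetical.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in an inclusive, 1-based CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def codon_span(start: int, residue_index: int) -> tuple:
    """Nucleotide positions (first, last) of the codon for a 1-based residue index."""
    first = start + 3 * (residue_index - 1)
    return first, first + 2


print(cds_codon_count(1, 630))  # 210
print(codon_span(1, 210))       # (628, 630)
```

Under these assumptions the feature encodes 210 residues, with the last codon at nucleotides 628-630.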

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GCA GTC GCC GCT CCA GTC TAT CCG GCA CTA GGA ACA GCC CCG GGN GGC 48
Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

GAG ACG GTC CCC GCC ATG TCT GCG GCC ATG AGG GAG AGG TTC GAC CGG 96
Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

TTC CTG CAC GAG AAG AAC TGC ATG ACT GAC CTT CTG GCC AAG CTC GAG 144
Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

GCC AAA ACC GGC GTG AAC AGG AGC TTC ATC GCT CTT GGT GTC ATC GGA 192
Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

CTG GTG GCC TTG TAC CTG GTG TTC GGT TAT GGA GCC TCT CTC CTC TGC 240
Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

AAC CTG ATA GGA TTT GGC TAC CCA GCC TAC ATC TCA ATT AAA GCT ATA 288
Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

GAG AGT CCC AAC AAA GAA GAT GAT ACC CAG TGG CTG ACC TAC TGG GTA 336
Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

GTG TAT GGT GTG TTC AGC ATT GCT GAA TTC TTC TCT GAT ATC TTC CTG 384
Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

TCA TGG TTC CCC TTC TAC TAC ATG CTG AAG TGT GGC TTC CTG TTG TGG 432
Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

TGC ATG GCC CCG AGC CCT TCT AAT GGG GCT GAA CTG CTC TAC AAG CGC 480
Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

ATC ATC CGT CCT TTC TTC CTG AAG CAC GAG TCC CAG ATG GAC AGT GTG 528

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

GTC AAG GAC CTT AAA GAC AAG TCC AAA GAG ACT GCA GAT GCC ATC ACT 576

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

AAA GAA GCG AAG AAA GCT ACC GTG AAT TTA CTG GGT GAA GAA AAG AAG 624

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

AGC ACC TAAACCApAC TAAACCAGAC TGGATGGAAA CTTCCTGCCC TCTCTGTACC 680
Ser Thr
210

TTCCTACTGG AGCTTGATGT TATATTAGGG ACTGTGGTAT AATTATTTTA ATAATGTTGC 740 

CTTGGAAACA TTTTTGAGAT ATTAAAGATT GGAATGTGTT GTAAGTTTCT TTGCTTACTT 800 

TTACTGTCTA TATATATAGG GAGCACTTTA AACTTAATGC AGTGGGCAGT GTCCACGTTT 860

TTGGAAAATG TATTTTGCCT CTGGGTAGGA AAAGATGTAT GTTGCTATCC TGCAGGAAAT 920 

ATAAACTTAA AATAAAATTA TATACCCCAC AGGCTGTGTA CTTTACTGGG CTCTCCCTGC 980 

ACGSATTTTC TCTGTAGTTA CATTTAGGRT AATCTTTATG GTTCTACTTC CTRTAATGTA 1040

CAATTTTATA TAATTCNGRA ATGTTTTTAA TGTATTTGTG CACATGTACA TATGGAAATG 1100

TTACTGTCTG ACTACANCAT GCATCATGCT CATGGGGAGG GAGCAGGGGA AGGTTGTATG 1160

TGTCATTTAT AACTTCTGTA CAGTAAGACC ACCTGCCAAA AGCTGGAGGA ACCATTGTGC 1220

TGGTGTGGTC TACTAAATAA TACTTTAGGA AATACGTGAT TAATATGCAA GTGAACAAAG 1280

TGAGAAATGA AATCGAATGG AGATTGGCCT GGTTGTTTCC GTAGTATATG GCATATGAAT 1340

ACCAGGATAG CTTTATAAAG CAGTTAGTTA GTTAGTTACT CACTCTAGTG ATAAATCGGG 1400 

AAATTTACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAG 1460 

AGTACCCTGT AACTCTCAAT TCCCTGAAAA ACTAGTAATA CTGTCTTATC TGCTATAAAC 1520 

TTTACATATT TGTCTATTGT CAAGATGCTA CANTGGAMNC CATTTCTGGT TTTATCTTCA 1580 

NAGSGGAGAN ACATGTTGAT TTAGTCTTCT TTCCCAATCT TCTTTTTTAA MCCAGTTTNA 1640 

GGMNCTTCTG RAGATTTGYC CACCTCTGAT TACATGTATG TTCTYGTTTG TATCATKAGC 1700 

AACAACATGC TAATGRCGAC ACCTAGCTCT RAGMGCAATT CTGGGAGANT GARAGGNWGT 1760 

ATARAGTMNC CCATAATCTG CTTGGCAATA GTTAAGTCAA TCTATCTTCA GTTTTTCTCT 1820

GGCCTTTAAG GTCAAACACA AGAGGCTTCC CTAGTTTACA AGTCAGAGTC ACTTGTAGTC 1880 

CATTTAAATG CCCTCATCCG TATTCTTTGT GTTGATAAGC TGCACAKGAC TACATAGTAA 1940 

GTACAGANCA GTAAAGTTAA NNCGGATGTC TCCATTGATC TGCCAANTCG NTATAGAGAG 2000 

CAATTTGTCT GGACTAGAAA ATCTGAGTTT TACACCATAC TGTTAAGAGT CCTTTTGAAT 2060 

TAAACTAGAC TAAAACAAGT GTATAACTAA ACTAACAAGA TTAAATATCC AGCCAGTACA 2120

GTATTTTTTA AGGCAAATAA AGATGATTAG CTCACCTTGA GNTAACAATC AGGTAAGATC 2180

ATNACAATGT CTCATGATGT NAANAATATT AAAGATATCA ATACTAAGTG ACAGTATCAC 2240

NNCTAATATA ATATGGATCA GAGCATTTAT TTTGGGGAGG AAAACAGTGG TGATTACCGG 2300

CATTTTATTA AACTTAAAAC TTTGTAGAAA GCAAACAAAA TTGTTCTTGG GAGAAAATCA 2360

ACTTTTAGAT TAAAAAAATT TTAAGTAWCT AGGAGTATTT AAATCCTTTT CCCATAAATA 2420

AAAGTACAGT TTTCTTGGTG GCAGAATGAA AATCAGCAAC NTCTAGCATA TAGACTATAT 2480

AATCAGATTG ACAGCATATA GAATATATTA TCAGACAAGA TGAGGAGGTA CAAAAGTTAC 2540

TATTGCTCAT AATGACTTAC AGGCTAAAAN TAGNTNTAAA ATACTATATT AAATTCTGAA 2600

TGCAATTTTT TTTTGTTCCC TTGAGACCAA AATTTAAGTT AACTGTTGCT GGCAGTCTAA 2660

GTGTAAATGT TAACAGCAGG AGAAGTTAAG AATTGAGCAG TTCTGTTGCA TGATTTCCCA 2720

AATGAAATAC TGCCTTGGCT AGAGTTTGAA AAACTAATTG AGCCTGTGCC TGGCTAGAAA 2780

ACAAGCGTTT ATTTGAATGT GAATAGTGTT TCAAAGGTAT GTAGTTACAG AATTCCTACC 2840

AAACAGCTTA AATTCTTCAA GAAAGAATTC CTGCAGCAGT TATTCCCTTA CCTGAAGGCT 2900

TCAATCATTT GGATCAACAA CTGCTACTCT CGGGAAGACT CCTCTACTCA CAGCTGAAGA 2960

AAATGAGCAC ACCCTTCACA CTGTTATCAC CTATCCTGAA GATGTGATAC ACTGAATGGA 3020

AATAAATAGA TGTAAATAAA ATTGAGWTCT CATTTAAAAA AAACCATGTG CCCAATGGGA 3080

AAATGACCTC ATGTTGTGGT TTAAACAGCA ACTGCACCCA CTAGCACAGC CCATTGAGCT 3140

ANCCTATATA TACATCTCTG TCAGTGCCCC TC 3172

( 2 ) INFORMATION FOR SEQ ID NO:4: 

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 210 amino acids
( B ) TYPE: amino acid 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: protein 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly 
1 5 10 15 

Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

Ser Thr
210
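
The 210-residue protein listed here corresponds to the translated codons of the preceding nucleotide sequence, so any codon row can be checked against its amino acid row. The following is a minimal Python sketch of such a check. It assumes the standard genetic code, which the listing does not state explicitly; the name `translate_codons` and the lookup tables are illustrative only and are not part of the sequence listing. The example uses the codon row ending at nucleotide 240 (residues 65 to 80).

```python
# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption: the listing uses the standard code; it does not say so itself.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_codons(row: str) -> str:
    """Translate a whitespace-separated row of codons into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in row.split())


# Codon row ending at nucleotide 240, copied from the nucleotide sequence above.
row = "CTG GTG GCC TTG TAC CTG GTG TTC GGT TAT GGA GCC TCT CTC CTC TGC"
print(translate_codons(row))
# prints: Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
```

A row whose translation disagrees with the printed residues points to a transcription error in either the codon line or the amino acid line.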

( 2 ) INFORMATION FOR SEQ ID NO:5:

( i ) SEQUENCE CHARACTERISTICS:

( A ) LENGTH: 434 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: protein

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens

( v i i ) IMMEDIATE SOURCE:
( B ) CLONE: TB1

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:5: 



Val Ala Pro Val Val Val Gly Ser Gly Arg Ala Pro Arg His Pro Ala
1 5 10 15

Pro Ala Ala Met His Pro Arg Arg Pro Asp Gly Phe Asp Gly Leu Gly
20 25 30

Tyr Arg Gly Gly Ala Arg Asp Glu Gln Gly Phe Gly Gly Ala Phe Pro
35 40 45

Ala Arg Ser [?] Ser Thr Gly Ser Asp Leu Gly His Trp Val Thr Thr
50 55 60

Pro Pro Asp Ile Pro Gly Ser Arg Asn Leu His Trp Gly Glu Lys Ser
65 70 75 80

Pro Pro Tyr Gly Val Pro Thr Thr Ser Thr Pro Tyr Glu Gly Pro Thr
85 90 95

Glu Glu Pro Phe Ser Ser Gly Gly Gly Gly Ser Val Gln Gly Gln Ser
100 105 110

Ser Glu Gln Leu Asn Arg Phe Ala Gly Phe Gly [?] Gly Leu Ala Ser
115 120 125

Leu Phe Thr Glu Asn Val Leu Ala His Pro Cys Ile Val Leu Arg Arg
130 135 140

Gln Cys Gln Val Asn Tyr His Ala Gln His Tyr His Leu Thr Pro Phe
145 150 155 160

Thr Val Ile As[?] Ile Met Tyr Ser Phe Asn Lys Thr Gln Gly Pro Arg
165 170 175

Ala Leu Trp Lys Gly Met Gly Ser Thr Phe Ile Val Gln Gly Val Thr
180 185 190

Leu Gly Ala Glu Gly Ile Ile Ser Glu Phe Thr Pro Leu Pro Arg Glu
195 200 205

Val Leu His Lys Trp Ser Pro Lys Gln Ile Gly Glu His Leu Leu Leu
210 215 220

Lys Ser Leu Thr Tyr Val Val Ala Met Pro Phe Tyr Ser Ala Ser Leu
225 230 235 240

Ile Glu Thr Val Gln Ser Glu Ile Ile Arg Asp Asn Thr Gly Ile Leu
245 250 255

Glu Cys Val Lys Glu Gly Ile Gly Arg Val Ile Gly Met Gly Val Pro
260 265 270

His Ser Lys Arg Leu Leu Pro Leu Leu Ser Leu Ile Phe Pro Thr Val
275 280 285

Leu His Gly Val Leu His Tyr Ile Ile Ser Ser Val Ile Gln Lys Phe
290 295 300

Val Leu Leu Ile Leu Lys Arg Lys Thr Tyr Asn [?] His Leu Ala Glu
305 310 315 320

Ser Thr Ser Pro Val Gln Ser Met Leu Asp Ala Tyr Phe Pro Glu Leu
325 330 335

Ile Ala Asn Phe Ala Ala Ser Leu Cys Ser Asp Val Ile Leu Tyr Pro
340 345 350

Leu Glu Thr Val Leu His Arg Leu His Ile Gln Gly Thr Arg Thr Ile
355 360 365

Ile Asp Asn Thr Asp Leu Gly Tyr Glu Val Leu Pro Ile Asn Thr Gln
370 375 380

Tyr Glu Gly Met Arg Asp Cys Ile Asn Thr Ile Arg Gln Glu Glu Gly
385 390 395 400

Val Phe Gly Phe Tyr Lys Gly Phe Gly Ala Val Ile Ile Gln Tyr Thr
405 410 415

Leu His Ala Ala Val Leu Gln Ile Thr Lys Ile Ile Tyr Ser Thr Leu
420 425 430



( 2 ) INFORMATION FOR SEQ ID NO:6:

( i ) SEQUENCE CHARACTERISTICS:

( A ) LENGTH: 185 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: protein

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens

( v i i ) IMMEDIATE SOURCE:

( B ) CLONE: YS-39(TB2)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Glu Leu Arg Arg Phe Asp Arg Phe Leu His Glu Lys Asn Cys Met Thr 
1 5 10 15 

Asp Leu Leu Ala Lys Leu Glu Ala Lys Thr Gly Val Asn Arg Ser Phe 
20 25 30

Ile Ala Leu Gly Val Ile Gly Leu Val Ala Leu Tyr Leu Val Phe Gly
35 40 45

Tyr Gly Ala Ser Leu Leu Cys Asn Leu Ile Gly Phe Gly Tyr Pro Ala
50 55 60

Tyr Ile Ser Ile Lys Ala Ile Glu Ser Pro Asn Lys Glu Asp Asp Thr
65 70 75 80

Gln Trp Leu Thr Tyr Trp Val Val Tyr Gly Val Phe Ser Ile Ala Glu
85 90 95

Phe Phe Ser Asp Ile Phe Leu Ser Trp Phe Pro Phe Tyr Tyr Ile Leu
100 105 110

Lys Cys Gly Phe Leu Leu Trp Cys Met Ala Pro Ser Pro Ser Asn Gly
115 120 125

Ala Glu Leu Leu Tyr Lys Arg Ile Ile Arg Pro Phe Phe Leu Lys His
130 135 140

Glu Ser Gln Met Asp Ser Val Val Lys Asp Leu Lys Asp Lys Ala Lys
145 150 155 160

Glu Thr Ala Asp Ala Ile Thr Lys Glu Ala Lys Lys Ala Thr Val Asn
165 170 175

Leu Leu Gly Glu Glu Lys Lys Ser Thr
180 185



( 2 ) INFORMATION FOR SEQ ID NO:7:

( i ) SEQUENCE CHARACTERISTICS:

( A ) LENGTH: 2842 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: protein

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( v i i ) IMMEDIATE SOURCE: 
( B ) CLONE: APC 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:7: 



Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu
1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn
20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu
35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly
50 55 60

Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser
65 70 75 80

Asn Phe Pro Gly Val Lys Leu Arg Ser Lys Met Ser Leu Arg Ser Tyr
85 90 95

Gly Ser Arg Glu Gly Ser Val Ser Ser Arg Ser Gly Glu Cys Ser Pro
100 105 110

Val Pro Met Gly Ser Phe Pro Arg Arg Gly Phe Val Asn Gly Ser Arg
115 120 125

Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu
130 135 140

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala
145 150 155 160

Gln Leu Gln Asn Leu Thr Lys Arg Ile Asp Ser Leu Leu Thr Gln Asn
165 170 175

Phe Ser Leu Gln Thr Asp Met Thr Arg Arg Gln Leu Glu Tyr Glu Ala
180 185 190

Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln Asp
195 200 205

Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile Glu
210 215 220

Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr Glu
225 230 235 240

Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp Ala
245 250 255

Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala Thr
260 265 270

Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr Ala
275 280 285

Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu Thr
290 295 300

Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser Met
305 310 315 320

Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala Met
325 330 335

Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys Leu
340 345 350

Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val Leu
355 360 365

Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser Ala
370 375 380

Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly Arg
385 390 395 400

Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr Cys
405 410 415

Glu Thr Cys Trp Glu Trp Gln Glu Ala His Glu Pro Gly Met Asp Gln
420 425 430

Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro Ala
435 440 445

Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His Ala
450 455 460

Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln Val
465 470 475 480

Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr Leu
485 490 495

Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp Val
500 505 510

Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala Leu
515 520 525

Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile Ala
530 535 540

Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys Lys
545 550 555 560

Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala Leu
565 570 575

Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu Trp
580 585 590

Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala Val
595 600 605

Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser Gln
610 615 620

Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg Asn
625 630 635 640

Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu Arg
645 650 655

Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His Ser
660 665 670

Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser Ala
675 680 685

Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val Ser
690 695 700

Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met Gly
705 710 715 720

Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys Tyr
725 730 735

Lys Asp Ala Asn Ile Met Ser Pro Gly Ser Ser Leu Pro Ser Leu His
740 745 750

Val Arg Lys Gln Lys Ala Leu Glu Ala Glu Leu Asp Ala Gln His Leu
755 760 765

Ser Glu Thr Phe Asp Asn Ile Asp Asn Leu Ser Pro Lys Ala Ser His
770 775 780

Arg Ser Lys Gln Arg His Lys Gln Ser Leu Tyr Gly Asp Tyr Val Phe
785 790 795 800

Asp Thr Asn Arg His Asp Asp Asn Arg Ser Asp Asn Phe Asn Thr Gly
805 810 815

Asn Met Thr Val Leu Ser Pro Tyr Leu Asn Thr Thr Val Leu Pro Ser
820 825 830

Ser Ser Ser Ser Arg Gly Ser Leu Asp Ser Ser Arg Ser Glu Lys Asp
835 840 845

Arg Ser Leu Glu Arg Glu Arg Gly Ile Gly Leu Gly Asn Tyr His Pro
850 855 860

Ala Thr Glu Asn Pro Gly Thr Ser Ser Lys Arg Gly Leu Gln Ile Ser
865 870 875 880

Thr Thr Ala Ala Gln Ile Ala Lys Val Met Glu Glu Val Ser Ala Ile
885 890 895

His Thr Ser Gln Glu Asp Arg Ser Ser Gly Ser Thr Thr Glu Leu His
900 905 910

Cys Val Thr Asp Glu Arg Asn Ala Leu Arg Arg Ser Ser Ala Ala His
915 920 925

Thr His Ser Asn Thr Tyr Asn Phe Thr Lys Ser Glu Asn Ser Asn Arg
930 935 940

Thr Cys Ser Met Pro Tyr Ala Lys Leu Glu Tyr Lys Arg Ser Ser Asn
945 950 955 960

Asp Ser Leu Asn Ser Val Ser Ser Ser Asp Gly Tyr Gly Lys Arg Gly
965 970 975

Gln Met Lys Pro Ser Ile Glu Ser Tyr Ser Glu Asp Asp Glu Ser Lys
980 985 990

Phe Cys Ser Tyr Gly Gln Tyr Pro Ala Asp Leu Ala His Lys Ile His
995 1000 1005

Ser Ala Asn His Met Asp Asp Asn Asp Gly Glu Leu Asp Thr Pro Ile
1010 1015 1020

Asn Tyr Ser Leu Lys Tyr Ser Asp Glu Gln Leu Asn Ser Gly Arg Gln
1025 1030 1035 1040

Ser Pro Ser Gln Asn Glu Arg Trp Ala Arg Pro Lys His Ile Ile Glu
1045 1050 1055

Asp Glu Ile Lys Gln Ser Glu Gln Arg Gln Ser Arg Asn Gln Ser Thr
1060 1065 1070

Thr Tyr Pro Val Tyr Thr Glu Ser Thr Asp Asp Lys His Leu Lys Phe
1075 1080 1085

Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser Arg
1090 1095 1100

Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly Ile
1105 1110 1115 1120

Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu Asp
1125 1130 1135

Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln His
1140 1145 1150

Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu Glu
1155 1160 1165

Lys Arg His Val Asp Gln Pro Ile Asp Tyr Ser Leu Lys Tyr Ala Thr
1170 1175 1180

Asp Ile Pro Ser Ser Gln Lys Gln Ser Phe Ser Phe Ser Lys Ser Ser
1185 1190 1195 1200

Ser Gly Gln Ser Ser Lys Thr Glu His Met Ser Ser Ser Ser Glu Asn
1205 1210 1215

Thr Ser Thr Pro Ser Ser Asn Ala Lys Arg Gln Asn Gln Leu His Pro
1220 1225 1230

Ser Ser Ala Gln Ser Arg Ser Gly Gln Pro Gln Lys Ala Ala Thr Cys
1235 1240 1245

Lys Val Ser Ser Ile Asn Gln Glu Thr Ile Gln Thr Tyr Cys Val Glu
1250 1255 1260

Asp Thr Pro Ile Cys Phe Ser Arg Cys Ser Ser Leu Ser Ser Leu Ser
1265 1270 1275 1280

Ser Ala Glu Asp Glu Ile Gly Cys Asn Gln Thr Thr Gln Glu Ala Asp
1285 1290 1295

Ser Ala Asn Thr Leu Gln Ile Ala Glu Ile Lys Glu Lys Ile Gly Thr
1300 1305 1310

Arg Ser Ala Glu Asp Pro Val Ser Glu Val Pro Ala Val Ser Gln His
1315 1320 1325

Pro Arg Thr Lys Ser Ser Arg Leu Gln Gly Ser Ser Leu Ser Ser Glu
1330 1335 1340

Ser Ala Arg His Lys Ala Val Glu Phe Ser Ser Gly Ala Lys Ser Pro
1345 1350 1355 1360

Ser Lys Ser Gly Ala Gln Thr Pro Lys Ser Pro Pro Glu His Tyr Val
1365 1370 1375

Gln Glu Thr Pro Leu Met Phe Ser Arg Cys Thr Ser Val Ser Ser Leu
1380 1385 1390

Asp Ser Phe Glu Ser Arg Ser Ile Ala Ser Ser Val Gln Ser Glu Pro
1395 1400 1405

Cys Ser Gly Met Val Ser Gly Ile Ile Ser Pro Ser Asp Leu Pro Asp
1410 1415 1420

Ser Pro Gly Gln Thr Met Pro Pro Ser Arg Ser Lys Thr Pro Pro Pro
1425 1430 1435 1440

Pro Pro Gln Thr Ala Gln Thr Lys Arg Glu Val Pro Lys Asn Lys Ala
1445 1450 1455

Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val Asn
1460 1465 1470

Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu Leu
1475 1480 1485

His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser Ser
1490 1495 1500

Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val Glu
1505 1510 1515 1520

Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu Thr
1525 1530 1535

Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu Ala
1540 1545 1550

Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp Asp
1555 1560 1565

Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro Thr
1570 1575 1580

Lys Ser Ser Arg Lys Ala Lys Lys Pro Ala Gln Thr Ala Ser Lys Leu
1585 1590 1595 1600

Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys Leu
1605 1610 1615

Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe Thr
1620 1625 1630

Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro Ile
1635 1640 1645

Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser Pro
1650 1655 1660

Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln Ser
1665 1670 1675 1680

Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser Thr
1685 1690 1695

Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu Leu
1700 1705 1710

Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile Asn
1715 1720 1725

Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys Lys
1730 1735 1740

Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro Asn
1745 1750 1755 1760

Lys Asn Gln Leu Asp Gly Lys Lys Lys Lys Pro Thr Ser Pro Val Lys
1765 1770 1775

Pro Ile Pro Gln Asn Thr Glu Tyr Arg Thr Arg Val Arg Lys Asn Ala
1780 1785 1790

Asp Ser Lys Asn Asn Leu Asn Ala Glu Arg Val Phe Ser Asp Asn Lys
1795 1800 1805

Asp Ser Lys Lys Gln Asn Leu Lys Asn Asn Ser Lys Asp Phe Asn Asp
1810 1815 1820

Lys Leu Pro Asn Asn Glu Asp Arg Val Arg Gly Ser Phe Ala Phe Asp
1825 1830 1835 1840

Ser Pro His His Tyr Thr Pro Ile Glu Gly Thr Pro Tyr Cys Phe Ser
1845 1850 1855

Arg Asn Asp Ser Leu Ser Ser Leu Asp Phe Asp Asp Asp Asp Val Asp
1860 1865 1870

Leu Ser Arg Glu Lys Ala Glu Leu Arg Lys Ala Lys Glu Asn Lys Glu
1875 1880 1885

Ser Glu Ala Lys Val Thr Ser His Thr Glu Leu Thr Ser Asn Gln Gln
1890 1895 1900

Ser Ala Asn Lys Thr Gln Ala Ile Ala Lys Gln Pro Ile Asn Arg Gly
1905 1910 1915 1920

Gln Pro Lys Pro Ile Leu Gln Lys Gln Ser Thr Phe Pro Gln Ser Ser
1925 1930 1935

Lys Asp Ile Pro Asp Arg Gly Ala Ala Thr Asp Glu Lys Leu Gln Asn
1940 1945 1950

Phe Ala Ile Glu Asn Thr Pro Val Cys Phe Ser His Asn Ser Ser Leu
1955 1960 1965

Ser Ser Leu Ser Asp Ile Asp Gln Glu Asn Asn Asn Lys Glu Asn Glu
1970 1975 1980

Pro Ile Lys Glu Thr Glu Pro Pro Asp Ser Gln Gly Glu Pro Ser Lys
1985 1990 1995 2000

Pro Gln Ala Ser Gly Tyr Ala Pro Lys Ser Phe His Val Glu Asp Thr
2005 2010 2015

Pro Val Cys Phe Ser Arg Asn Ser Ser Leu Ser Ser Leu Ser Ile Asp
2020 2025 2030

Ser Glu Asp Asp Leu Leu Gln Glu Cys Ile Ser Ser Ala Met Pro Lys
2035 2040 2045

Lys Lys Lys Pro Ser Arg Leu Lys Gly Asp Asn Glu Lys His Ser Pro
2050 2055 2060

Arg Asn Met Gly Gly Ile Leu Gly Glu Asp Leu Thr Leu Asp Leu Lys
2065 2070 2075 2080

Asp Ile Gln Arg Pro Asp Ser Glu His Gly Leu Ser Pro Asp Ser Glu
2085 2090 2095

Asn Phe Asp Trp Lys Ala Ile Gln Glu Gly Ala Asn Ser Ile Val Ser
2100 2105 2110

Ser Leu His Gln Ala Ala Ala Ala Ala Cys Leu Ser Arg Gln Ala Ser
2115 2120 2125

Ser Asp Ser Asp Ser Ile Leu Ser Leu Lys Ser Gly Ile Ser Leu Gly
2130 2135 2140

Ser Pro Phe His Leu Thr Pro Asp Gln Glu Glu Lys Pro Phe Thr Ser
2145 2150 2155 2160

Asn Lys Gly Pro Arg Ile Leu Lys Pro Gly Glu Lys Ser Thr Leu Glu
2165 2170 2175

Thr Lys Lys Ile Glu Ser Glu Ser Lys Gly Ile Lys Gly Gly Lys Lys
2180 2185 2190

Val Tyr Lys Ser Leu Ile Thr Gly Lys Val Arg Ser Asn Ser Glu Ile
2195 2200 2205

Ser Gly Gln Met Lys Gln Pro Leu Gln Ala Asn Met Pro Ser Ile Ser
2210 2215 2220

Arg Gly Arg Thr Met Ile His Ile Pro Gly Val Arg Asn Ser Ser Ser
2225 2230 2235 2240

Ser Thr Ser Pro Val Ser Lys Lys Gly Pro Pro Leu Lys Thr Pro Ala
2245 2250 2255

Ser Lys Ser Pro Ser Glu Gly Gln Thr Ala Thr Thr Ser Pro Arg Gly
2260 2265 2270

Ala Lys Pro Ser Val Lys Ser Glu Leu Ser Pro Val Ala Arg Gln Thr
2275 2280 2285

Ser Gln Ile Gly Gly Ser Ser Lys Ala Pro Ser Arg Ser Gly Ser Arg
2290 2295 2300

Asp Ser Thr Pro Ser Arg Pro Ala Gln Gln Pro Leu Ser Arg Pro Ile
2305 2310 2315 2320

Gln Ser Pro Gly Arg Asn Ser Ile Ser Pro Gly Arg Asn Gly Ile Ser
2325 2330 2335

Pro Pro Asn Lys Leu Ser Gln Leu Pro Arg Thr Ser Ser Pro Ser Thr
2340 2345 2350

Ala Ser Thr Lys Ser Ser Gly Ser Gly Lys Met Ser Tyr Thr Ser Pro
2355 2360 2365

Gly Arg Gln Met Ser Gln Gln Asn Leu Thr Lys Gln Thr Gly Leu Ser
2370 2375 2380

Lys Asn Ala Ser Ser Ile Pro Arg Ser Glu Ser Ala Ser Lys Gly Leu
2385 2390 2395 2400

Asn Gln Met Asn Asn Gly Asn Gly Ala Asn Lys Lys Val Glu Leu Ser
2405 2410 2415

Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser Glu
2420 2425 2430

Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro Ser
2435 2440 2445

Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser Leu
2450 2455 2460

Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln Thr
2465 2470 2475 2480

Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His Ser
2485 2490 2495
2485 2490 2495 
Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser Pro
2500 2505 2510

Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile Ala
2515 2520 2525

Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser Gly
2530 2535 2540

Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg Val
2545 2550 2555 2560

Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala Ser
2565 2570 2575

Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val Asn
2580 2585 2590

Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala Lys
2595 2600 2605

Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn Ser
2610 2615 2620

Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser Lys
2625 2630 2635 2640

Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp Val
2645 2650 2655

Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly Arg
2660 2665 2670

Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu Lys
2675 2680 2685

Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln Asn
2690 2695 2700

Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn Arg
2705 2710 2715 2720

Leu Asn Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr Glu
2725 2730 2735

Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn Glu
2740 2745 2750

Ser Ser Ile Val Glu Arg Thr Pro Phe Ser Ser Ser Ser Ser Ser Lys
2755 2760 2765

His Ser Ser Pro Ser Gly Thr Val Ala Ala Arg Val Thr Pro Phe Asn
2770 2775 2780

Tyr Asn Pro Ser Pro Arg Lys Ser Ser Ala Asp Ser Thr Ser Ala Arg
2785 2790 2795 2800

Pro Ser Gln Ile Pro Thr Pro Val Asn Asn Asn Thr Lys Lys Arg Asp
2805 2810 2815

Ser Lys Thr Asp Ser Thr Glu Ser Ser Gly Thr Gln Ser Pro Lys Arg
2820 2825 2830

His Ser Gly Ser Tyr Leu Val Thr Ser Val
2835 2840



( 2 ) INFORMATION FOR SEQ ID NO:8:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 31 amino acids
( B ) TYPE: amino acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( v i i ) IMMEDIATE SOURCE: 

( B ) CLONE: ral2(yeast)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Leu Thr Gly Ala Lys Gly Leu Gln Leu Arg Ala Leu Arg Arg Ile Ala
1 5 10 15

Arg Ile Glu Gln Gly Gly Thr Ala Ile Ser Pro Thr Ser Pro Leu
20 25 30

( 2 ) INFORMATION FOR SEQ ID NO:9:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 29 amino acids
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( v i i ) IMMEDIATE SOURCE: 

( B ) CLONE: m3(mAChR)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Leu Tyr Trp Arg Ile Tyr Lys Glu Thr Glu Lys Arg Thr Lys Glu Leu
1 5 10 15

Ala Gly Leu Gln Ala Ser Gly Thr Glu Ala Glu Thr Glu
20 25

( 2 ) INFORMATION FOR SEQ ID NO:10:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 29 amino acids 
( B ) TYPE: amino acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: peptide 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( v i i ) IMMEDIATE SOURCE: 
( B ) CLONE: MCC 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Leu Tyr Pro Asn Leu Ala Glu Glu Arg Ser Arg Trp Glu Lys Glu Leu
1 5 10 15

Ala Gly Leu Arg Glu Glu Asn Glu Ser Leu Thr Ala Met
20 25



( 2 ) INFORMATION FOR SEQ ID NO:11:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 40 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GTATCAAGAC TGTGACTTTT AATTGTAGTT TATCCATTTT 40



( 2 ) INFORMATION FOR SEQ ID NO:12: 
( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 40 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

TTTAGAATTT CATGTTAATA TATTGTGTTC TTTTTAACAG 



( 2 ) INFORMATION FOR SEQ ID NO:13:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

GTAGATTTTA AAAAGGTGTT TTAAAATAAT TTTTTAAGCT 



( 2 ) INFORMATION FOR SEQ ID NO:14: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:14:

AAGCAATTGT TGTATAAAAA CTTGTTTCTA TTTTATTTAG 



( 2 ) INFORMATION FOR SEQ ID NO:15:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:15:

GTAACTTTTC TTCATATAGT AAACATTGCC TTGTGTACTC 



( 2 ) INFORMATION FOR SEQ ID NO:16: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 



( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

NNNNNNNNNN NNNGTCCCTT TTTTTAAAAA AAAAAAATAG 

( 2 ) INFORMATION FOR SEQ ID NO:17:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 40 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

GTAAGTAACT TGGCAGTACA ACTTATTTGA AACTTTAATA 

( 2 ) INFORMATION FOR SEQ ID NO:18:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

ATACAAGATA TTGATACTTT TTTATTATTT GTGGTTTTAG

( 2 ) INFORMATION FOR SEQ ID NO:19:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 40 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:19: 

GTAAGTTACT TGTTTCTAAG TGATAAAACA GYGAAGAGCT



( 2 ) INFORMATION FOR SEQ ID NO:20:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 40 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:20:

AATAAAAACA TAACTAATTA GGTTTCTTGT TTTATTTTAG

( 2 ) INFORMATION FOR SEQ ID NO:21:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

GTTAGTAAAT TSCCTTTTTT GTTTGTGGGT ATAAAAATAG 40



( 2 ) INFORMATION FOR SEQ ID NO:22: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear



( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

ACCATTTTTG CATGTACTGA TGTTAACTCC ATCTTAACAG 40



( 2 ) INFORMATION FOR SEQ ID NO:23:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

GTAAATAAAT TATTTTATCA TATTTTTTAA AATTATTTAA 



( 2 ) INFORMATION FOR SEQ ID NO:24:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 64 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

CATGATGTTA TCTGTATTTA CCTATAGTCT AAATTATACC ATCTATAATG TGCTTAATTT 60

TTAG 64 



( 2 ) INFORMATION FOR SEQ ID NO:25:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 52 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

GTAACAGAAG ATTACAAACC CTGGTCACTA ATGCCATGAC TACTTTGCTA AG 52 



( 2 ) INFORMATION FOR SEQ ID NO:26: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 46 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:26:

GGATATTAAA GTCGTAATTT TGTTTCTAAA CTCATTTGGC CCACAG 46 



( 2 ) INFORMATION FOR SEQ ID NO:27: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GTATGTTCTC TATAGTGTAC ATCGTAGTGC ATGTTTCAAA 40 



( 2 ) INFORMATION FOR SEQ ID NO:28:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 56 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

CATCATTGCT CTTCAAATAA CAAAGCATTA TGGTTTATGT TGATTTTATT TTTCAG 56



( 2 ) INFORMATION FOR SEQ ID NO:29:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 43 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:29:

GTAAGACAAA AATGTTTTTT AATGACATAG ACAATTACTG GTG 43

( 2 ) INFORMATION FOR SEQ ID NO:30:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:30:

TTAGATGATT GTCTTTTTCC TCTTGCCCTT TTTAAATTAG 40 

( 2 ) INFORMATION FOR SEQ ID NO:31:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 44 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:31:

GTATGTTTTT ATAACATGTA TTTCTTAAGA TAGCTCAGGT ATGA 44

( 2 ) INFORMATION FOR SEQ ID NO:32:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 54 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:32:

GCTTGGCTTC AAGTTGNCTT TTTAATGATC CTCTATTCTG TATTTAATTT ACAG 54

( 2 ) INFORMATION FOR SEQ ID NO:33:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 65 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:33:

GTACTATTTA GAATTTCACC TGTTTTTCTT TTTTCTCTTT TTCTTTGAGG CAGGGTCTCA 60 

CTCTG 65 



( 2 ) INFORMATION FOR SEQ ID NO:34:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 52 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

GCAACTAGTA TGATTTTATG TATAAATTAA TCTAAAATTG ATTAATTTCC AG 52



( 2 ) INFORMATION FOR SEQ ID NO:35:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 42 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:35:

GTACCTTTGA AAACATTTAG TACTATAATA TGAATTTCAT GT 42



( 2 ) INFORMATION FOR SEQ ID NO:36:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 40 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:36:

CCAACTCNAA TTAGATGACC CATATTCAGA AACTTACTAG 40 



( 2 ) INFORMATION FOR SEQ ID NO:37:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 54 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:37:

GTATATATAG AGTTTTATAT TACTTTTAAA GTACAGAATT CATACTCTCA AAAA 54



( 2 ) INFORMATION FOR SEQ ID NO:38:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 41 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 



( i i ) MOLECULE TYPE: cDNA 
( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:38:

ATTGTGACCT TAATTTTGTG ATCTCTTGAT TTTTATTTCA G 41 



( 2 ) INFORMATION FOR SEQ ID NO:39:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 18 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:39:

( 2 ) INFORMATION FOR SEQ ID NO:40:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 18 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:40: 



( 2 ) INFORMATION FOR SEQ ID NO:41:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:41:



( 2 ) INFORMATION FOR SEQ ID NO:42: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 19 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:42: 



( 2 ) INFORMATION FOR SEQ ID NO:43:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:43: 



( 2 ) INFORMATION FOR SEQ ID NO:44: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:44:



( 2 ) INFORMATION FOR SEQ ID NO:45: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 21 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:45: 



( 2 ) INFORMATION FOR SEQ ID NO:46: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:46: 

TGGGGCCATC TTGTTCCTGA 20 



( 2 ) INFORMATION FOR SEQ ID NO:47:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

ACATTAGGCA CAAAGCTTGC AA 22 



( 2 ) INFORMATION FOR SEQ ID NO:48: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

ATCAAGCTCC AGTAAGAAGG TA 22



( 2 ) INFORMATION FOR SEQ ID NO:49: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 19 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:49: 



( 2 ) INFORMATION FOR SEQ ID NO:50: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:50:



( 2 ) INFORMATION FOR SEQ ID NO:51:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 21 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:51:



TTTTCTCCTG CCTCTTACTG C 






( 2 ) INFORMATION FOR SEQ ID NO:52:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:52: 



( 2 ) INFORMATION FOR SEQ ID NO:53: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:53:

CCACTTAAAG CACATATATT TAGT 24



( 2 ) INFORMATION FOR SEQ ID NO:54:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:54: 

GTATGGAAAA TAGTGAAGAA CC 22 



( 2 ) INFORMATION FOR SEQ ID NO:55: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:55:



( 2 ) INFORMATION FOR SEQ ID NO:56:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

TTTAGAACCT TTTTTGTGTT GTG 



( 2 ) INFORMATION FOR SEQ ID NO:57: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:57: 

CTCAGATTAT ACACTAAGCC TAAC



( 2 ) INFORMATION FOR SEQ ID NO:58:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

CATGTCTCTT ACAGTAGTAC CA 



( 2 ) INFORMATION FOR SEQ ID NO:59:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:59:
AGGTCCAAGG GTAGCCAAGG 



( 2 ) INFORMATION FOR SEQ ID NO:60: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 27 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:60:

TAAAAATGGA TAAACTACAA TTAAAAG 27



( 2 ) INFORMATION FOR SEQ ID NO:61: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:61: 

AAATACAGAA TCATGTCTTG AAGT



( 2 ) INFORMATION FOR SEQ ID NO:62: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:62: 
ACACCTAAAG ATGACAATTT GAG



( 2 ) INFORMATION FOR SEQ ID NO:63: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:63:



( 2 ) INFORMATION FOR SEQ ID NO:64:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:64:



( 2 ) INFORMATION FOR SEQ ID NO:65: 



( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid

( C ) STRANDEDNESS: single 

( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:65:

ATAGGTCATT GCTTCTTGCT GAT 



( 2 ) INFORMATION FOR SEQ ID NO:66: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:66:

TGAATTTTAA TGGATTACCT AGGT 



( 2 ) INFORMATION FOR SEQ ID NO:67:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:67: 

CTTTTTTTGC TTTTACTGAT TAACG 



( 2 ) INFORMATION FOR SEQ ID NO:68:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 27 base pairs

( B ) TYPE: nucleic acid

( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:68:

TGTAATTCAT TTTATTCCTA ATAGCTC 



( 2 ) INFORMATION FOR SEQ ID NO:69:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 



( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:69:
GGTAGCCATA GTATGATTAT TTCT 



( 2 ) INFORMATION FOR SEQ ID NO:70: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:70: 

CTACCTATTT TTATACCCAC AAAC



( 2 ) INFORMATION FOR SEQ ID NO:71: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:71: 

AAGAAAGCCT ACACCATTTT TGC 



( 2 ) INFORMATION FOR SEQ ID NO:72:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:72: 

GATCATTCTT AGAACCATCT TGC 



( 2 ) INFORMATION FOR SEQ ID NO:73: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:73: 

ACCTATAGTC TAAATTATAC CATC



( 2 ) INFORMATION FOR SEQ ID NO:74: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:74: 



( 2 ) INFORMATION FOR SEQ ID NO:75: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:75: 



( 2 ) INFORMATION FOR SEQ ID NO:76:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 21 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:76: 

TGAAGGACTC GGATTTCACG C 



( 2 ) INFORMATION FOR SEQ ID NO:77: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:77: 



( 2 ) INFORMATION FOR SEQ ID NO:78: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 



( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:78: 

GCTTTGAAAC ATGCACTACG AT 



( 2 ) INFORMATION FOR SEQ ID NO:79: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:79: 



( 2 ) INFORMATION FOR SEQ ID NO:80:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:80: 

TACCATGATT TAAAAATCCA CCAG 24 



( 2 ) INFORMATION FOR SEQ ID NO:81: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:81: 



( 2 ) INFORMATION FOR SEQ ID NO:82: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:82: 



CTGAGCTATC TTAAGAAATA CATG 

( 2 ) INFORMATION FOR SEQ ID NO:83:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:83: 

TTTTAAATGA TCCTCTATTC TGTAT 25



( 2 ) INFORMATION FOR SEQ ID NO:84: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 



( i i ) MOLECULE TYPE: cDNA 



( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:84: 

ACAGAGTCAG ACCCTGCCTC AAAG 24



( 2 ) INFORMATION FOR SEQ ID NO:85: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 23 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:85:



( 2 ) INFORMATION FOR SEQ ID NO:86: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:86: 



( 2 ) INFORMATION FOR SEQ ID NO:87:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:87: 

TAGATGACCC ATATTCTGTT TC 22 



( 2 ) INFORMATION FOR SEQ ID NO:88: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE:

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:88:

CAATTAGGTC TTTTTGAGAG TA 22



( 2 ) INFORMATION FOR SEQ ID NO:89: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:89: 



( 2 ) INFORMATION FOR SEQ ID NO:90:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 23 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:90:



( 2 ) INFORMATION FOR SEQ ID NO:91:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 21 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 

( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:91:



TCTCCCACAG GTAATACTCC C 

( 2 ) INFORMATION FOR SEQ ID NO:92:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 21 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:92:

GCTAGAACTG AATGGGGTAC G 



( 2 ) INFORMATION FOR SEQ ID NO:93:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:93: 
CAGGACAAAA TAATCCTGTC CC 

( 2 ) INFORMATION FOR SEQ ID NO:94:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 24 base pairs
( B ) TYPE: nucleic acid

( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:94: 
ATTTTCTTAG TTTCATTCTT CCTC 

( 2 ) INFORMATION FOR SEQ ID NO: 95: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:95:

AGAAGGATCC CTTGTGCAGT GTGGA 



( 2 ) INFORMATION FOR SEQ ID NO: 96: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 24 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:96:

GACAGGATCC TGAAGCTGAG TTTG 24 



( 2 ) INFORMATION FOR SEQ ID NO: 97: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 18 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:97:



( 2 ) INFORMATION FOR SEQ ID NO: 98: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 19 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:98:



( 2 ) INFORMATION FOR SEQ ID NO: 99: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 21 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:99:

GCAAATCCTA AGAGAGAACA A 



( 2 ) INFORMATION FOR SEQ ID NO: 100: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 19 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:100: 

GATGOCAAGC TTGAGCCAG



( 2 ) INFORMATION FOR SEQ ID NO: 101 : 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 18 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: cDNA 

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GTTCCAGCAG TGTCACAG 



( 2 ) INFORMATION FOR SEQ ID NO: 102: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 18 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: cDNA

( v i ) ORIGINAL SOURCE: 

( A ) ORGANISM: Homo sapiens 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:102: 

GGGAGATTTC GCTCCTGA 

We claim:

1. A preparation of antibodies which specifically binds to 
a human APC (adenomatous polyposis coli) protein having
an amino acid sequence as shown in SEQ ID NO: 1, 2, or 7, 
and does not specifically bind to other human proteins. 

2. A preparation of antibodies which specifically binds to 
a human APC protein which is the product of a mutant allele 
found in a tumor, wherein the antibodies do not specifically 
bind to other human proteins, and wherein the human APC 
protein is a mutant form of the amino acid sequence shown 
in SEQ ID NOS:2 and 7, and the mutant allele is a mutant 
form of the nucleotide sequence shown in SEQ ID NO:1.

3. The preparation of claim 2 wherein the mutant allele 
contains a mutation selected from the group consisting of 
mutations at codons 243, 279, 288, 301, 331, 413, 437, 456,
500, 712, and 1338. 

4. The preparation of claim 2 wherein the mutant allele 
contains a premature stop codon. 

5. The preparation of claim 2 wherein the mutant allele 
contains a missense mutation. 

6. The preparation of claim 2 wherein the mutant allele 
contains a frameshift mutation. 

7. The preparation of claim 2 wherein the mutant allele 
contains a splice junction mutation. 

8. The preparation of claim 2 wherein the mutant allele 
contains an insertion mutation. 



* * * * * 

(vii) IMMEDIATE SOURCE:

(B) CLONE: DP2.5(APC) 

(ix) FEATURE: 

(A) NAME/ KEY: CDS 

(B) LOCATION: 34..8562

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGACTCGGAA ATGAGGTCCA AGGGTAGCCA AGG ATG GCT GCA GCT TCA TAT GAT 54
Met Ala Ala Ala Ser Tyr Asp
1 5

CAG TTG TTA AAG CAA GTT GAG GCA CTG AAG ATG GAG AAC TCA AAT CTT 102
Gln Leu Leu Lys Gln Val Glu Ala Leu Lys Met Glu Asn Ser Asn Leu
10 15 20

CGA CAA GAG CTA GAA GAT AAT TCC AAT CAT CTT ACA AAA CTG GAA ACT 150
Arg Gln Glu Leu Glu Asp Asn Ser Asn His Leu Thr Lys Leu Glu Thr
25 30 35

GAG GCA TCT AAT ATG AAG GAA GTA CTT AAA CAA CTA CAA GGA AGT ATT 198
Glu Ala Ser Asn Met Lys Glu Val Leu Lys Gln Leu Gln Gly Ser Ile
40 45 50 55

GAA GAT GAA GCT ATG GCT TCT TCT GGA CAG ATT GAT TTA TTA GAG CGT 246
Glu Asp Glu Ala Met Ala Ser Ser Gly Gln Ile Asp Leu Leu Glu Arg
60 65 70

CTT AAA GAG CTT AAC TTA GAT AGC AGT AAT TTC CCT GGA GTA AAA CTG 294
Leu Lys Glu Leu Asn Leu Asp Ser Ser Asn Phe Pro Gly Val Lys Leu
75 80 85

CGG TCA AAA ATG TCC CTC CGT TCT TAT GGA AGC CGG GAA GGA TCT GTA 342
Arg Ser Lys Met Ser Leu Arg Ser Tyr Gly Ser Arg Glu Gly Ser Val
90 95 100

TCA AGC CGT TCT GGA GAG TGC AGT CCT GTT CCT ATG GGT TCA TTT CCA 390
Ser Ser Arg Ser Gly Glu Cys Ser Pro Val Pro Met Gly Ser Phe Pro
105 110 115

AGA AGA GGG TTT GTA AAT GGA AGC AGA GAA AGT ACT GGA TAT TTA GAA 438
Arg Arg Gly Phe Val Asn Gly Ser Arg Glu Ser Thr Gly Tyr Leu Glu
120 125 130 135

GAA CTT GAG AAA GAG AGG TCA TTG CTT CTT GCT GAT CTT GAC AAA GAA 486
Glu Leu Glu Lys Glu Arg Ser Leu Leu Leu Ala Asp Leu Asp Lys Glu
140 145 150

GAA AAG GAA AAA GAC TGG TAT TAC GCT CAA CTT CAG AAT CTC ACT AAA 534
Glu Lys Glu Lys Asp Trp Tyr Tyr Ala Gln Leu Gln Asn Leu Thr Lys
155 160 165

AGA ATA GAT AGT CTT CCT TTA ACT GAA AAT TTT TCC TTA CAA ACA GAT 582
Arg Ile Asp Ser Leu Pro Leu Thr Glu Asn Phe Ser Leu Gln Thr Asp
170 175 180

TTG ACC AGA AGG CAA TTG GAA TAT GAA GCA AGG CAA ATC AGA GTT GCG 630
Leu Thr Arg Arg Gln Leu Glu Tyr Glu Ala Arg Gln Ile Arg Val Ala
185 190 195

ATG GAA GAA CAA CTA GGT ACC TGC CAG GAT ATG GAA AAA CGA GCA CAG 678
Met Glu Glu Gln Leu Gly Thr Cys Gln Asp Met Glu Lys Arg Ala Gln
200 205 210 215

CGA AGA ATA GCC AGA ATT CAG CAA ATC GAA AAG GAC ATA CTT CGT ATA 726
Arg Arg Ile Ala Arg Ile Gln Gln Ile Glu Lys Asp Ile Leu Arg Ile
220 225 230

CGA CAG CTT TTA CAG TCC CAA GCA ACA GAA GCA GAG AGG TCA TCT CAG 774
Arg Gln Leu Leu Gln Ser Gln Ala Thr Glu Ala Glu Arg Ser Ser Gln
235 240 245

AAC AAG CAT GAA ACC GGC TCA CAT GAT GCT GAG CGG CAG AAT GAA GGT 822
Asn Lys His Glu Thr Gly Ser His Asp Ala Glu Arg Gln Asn Glu Gly
250 255 260

CAA GGA GTG GGA GAA ATC AAC ATG GCA ACT TCT GGT AAT GGT CAG GGT 870
Gln Gly Val Gly Glu Ile Asn Met Ala Thr Ser Gly Asn Gly Gln Gly
265 270 275

TCA ACT ACA CGA ATG GAC CAT GAA ACA GCC AGT GTT TTG AGT TCT AGT 918
Ser Thr Thr Arg Met Asp His Glu Thr Ala Ser Val Leu Ser Ser Ser
280 285 290 295

AGC ACA CAC TCT GCA CCT CGA AGG CTG ACA AGT CAT CTG GGA ACC AAG 966
Ser Thr His Ser Ala Pro Arg Arg Leu Thr Ser His Leu Gly Thr Lys
300 305 310

GTG GAA ATG GTG TAT TCA TTG TTG TCA ATG CTT GGT ACT CAT GAT AAG 1014
Val Glu Met Val Tyr Ser Leu Leu Ser Met Leu Gly Thr His Asp Lys
315 320 325

GAT GAT ATG TCG CGA ACT TTG CTA GCT ATG TCT AGC TCC CAA GAC AGC 1062 
Asp Asp Met Ser Arg Thr Leu Leu Ala Met Ser Ser Ser Gin Asp Ser 
330 335 340 

TGT ATA TCC ATG CGA CAG TCT GGA TGT CTT CCT CTC CTC ATC CAG CTT 1110 
Cys lie Ser Met Arg Gin Ser Gly Cys Leu Pro Leu Leu lie Gin Leu 
345 350 355 

TTA CAT GGC AAT GAC AAA GAC TCT GTA TTG TTG GGA AAT TCC CGG GGC 1158 
Leu His Gly Asn Asp Lys Asp Ser Val Leu Leu Gly Asn Ser Arg Gly 
360 365 370 375 

AGT AAA GAG GCT CGG GCC AGG GCC AGT GCA GCA CTC CAC AAC ATC ATT 1206 
Ser Lys Glu Ala Arg Ala Arg Ala Ser Ala Ala Leu His Asn lie lie 
380 385 390 

CAC TCA CAG CCT GAT GAC AAG AGA GGC AGG CGT GAA ATC CGA GTC CTT 1254 
His Ser Gln Pro Asp Asp Lys Arg Gly Arg Arg Glu Ile Arg Val Leu
395 400 405 

CAT CTT TTG GAA CAG ATA CGC GCT TAC TGT GAA ACC TGT TGG GAG TGG 1302 
His Leu Leu Glu Gln Ile Arg Ala Tyr Cys Glu Thr Cys Trp Glu Trp
410 415 420 

CAG GAA GCT CAT GAA CCA GGC ATG GAC CAG GAC AAA AAT CCA ATG CCA 1350 
Gln Glu Ala His Glu Pro Gly Met Asp Gln Asp Lys Asn Pro Met Pro
425 430 435 

GCT CCT GTT GAA CAT CAG ATC TGT CCT GCT GTG TGT GTT CTA ATG AAA 1398
Ala Pro Val Glu His Gln Ile Cys Pro Ala Val Cys Val Leu Met Lys
440 445 450 455 

CTT TCA TTT GAT GAA GAG CAT AGA CAT GCA ATG AAT GAA CTA GGG GGA 1446 
Leu Ser Phe Asp Glu Glu His Arg His Ala Met Asn Glu Leu Gly Gly 
460 465 470 

CTA CAG GCC ATT GCA GAA TTA TTG CAA GTG GAC TGT GAA ATG TAT GGG 1494 
Leu Gln Ala Ile Ala Glu Leu Leu Gln Val Asp Cys Glu Met Tyr Gly
475 480 485 

CTT ACT AAT GAC CAC TAC AGT ATT ACA CTA AGA CGA TAT GCT GGA ATG 1542 
Leu Thr Asn Asp His Tyr Ser Ile Thr Leu Arg Arg Tyr Ala Gly Met
490 495 500 

GCT TTG ACA AAC TTG ACT TTT GGA GAT GTA GCC AAC AAG GCT ACG CTA 1590 
Ala Leu Thr Asn Leu Thr Phe Gly Asp Val Ala Asn Lys Ala Thr Leu 
505 510 515 

TGC TCT ATG AAA GGC TGC ATG AGA GCA CTT GTG GCC CAA CTA AAA TCT 1638 
Cys Ser Met Lys Gly Cys Met Arg Ala Leu Val Ala Gln Leu Lys Ser
520 525 530 535

GAA AGT GAA GAC TTA CAG CAG GTT ATT GCA AGT GTT TTG AGG AAT TTG 1686
Glu Ser Glu Asp Leu Gln Gln Val Ile Ala Ser Val Leu Arg Asn Leu
540 545 550 

TCT TGG CGA GCA GAT GTA AAT AGT AAA AAG ACG TTG CGA GAA GTT GGA 1734 
Ser Trp Arg Ala Asp Val Asn Ser Lys Lys Thr Leu Arg Glu Val Gly 
555 560 565 

AGT GTG AAA GCA TTG ATG GAA TGT GCT TTA GAA GTT AAA AAG GAA TCA 1782 
Ser Val Lys Ala Leu Met Glu Cys Ala Leu Glu Val Lys Lys Glu Ser 
570 575 580 

ACC CTC AAA AGC GTA TTG AGT GCC TTA TGG AAT TTG TCA GCA CAT TGC 1830 
Thr Leu Lys Ser Val Leu Ser Ala Leu Trp Asn Leu Ser Ala His Cys 
585 590 595 

ACT GAG AAT AAA GCT GAT ATA TGT GCT GTA GAT GGT GCA CTT GCA TTT 1878 
Thr Glu Asn Lys Ala Asp Ile Cys Ala Val Asp Gly Ala Leu Ala Phe
600 605 610 615 

TTG GTT GGC ACT CTT ACT TAC CGG AGC CAG ACA AAC ACT TTA GCC ATT 1926 
Leu Val Gly Thr Leu Thr Tyr Arg Ser Gln Thr Asn Thr Leu Ala Ile
620 625 630 

ATT GAA AGT GGA GGT GGG ATA TTA CGG AAT GTG TCC AGC TTG ATA GCT 1974 
Ile Glu Ser Gly Gly Gly Ile Leu Arg Asn Val Ser Ser Leu Ile Ala
635 640 645 

ACA AAT GAG GAC CAC AGG CAA ATC CTA AGA GAG AAC AAC TGT CTA CAA 2022 
Thr Asn Glu Asp His Arg Gln Ile Leu Arg Glu Asn Asn Cys Leu Gln
650 655 660 

ACT TTA TTA CAA CAC TTA AAA TCT CAT AGT TTG ACA ATA GTC AGT AAT 2070 
Thr Leu Leu Gln His Leu Lys Ser His Ser Leu Thr Ile Val Ser Asn
665 670 675 

GCA TGT GGA ACT TTG TGG AAT CTC TCA GCA AGA AAT CCT AAA GAC CAG 2118 
Ala Cys Gly Thr Leu Trp Asn Leu Ser Ala Arg Asn Pro Lys Asp Gln
680 685 690 695 

GAA GCA TTA TGG GAC ATG GGG GCA GTT AGC ATG CTC AAG AAC CTC ATT 2166 
Glu Ala Leu Trp Asp Met Gly Ala Val Ser Met Leu Lys Asn Leu Ile
700 705 710 

CAT TCA AAG CAC AAA ATG ATT GCT ATG GGA AGT GCT GCA GCT TTA AGG 2214 
His Ser Lys His Lys Met Ile Ala Met Gly Ser Ala Ala Ala Leu Arg
715 720 725 

AAT CTC ATG GCA AAT AGG CCT GCG AAG TAC AAG GAT GCC AAT ATT ATG 2262 
Asn Leu Met Ala Asn Arg Pro Ala Lys Tyr Lys Asp Ala Asn Ile Met
730 735 740 

TCT CCT GGC TCA AGC TTG CCA TCT CTT CAT GTT AGG AAA CAA AAA GCC 2310
Ser Pro Gly Ser Ser Leu Pro Ser Leu His Val Arg Lys Gln Lys Ala
745 750 755

CTA GAA GCA GAA TTA GAT GCT GAG CAC TTA TCA GAA ACT TTT GAC AAT 2358
Leu Glu Ala Glu Leu Asp Ala Gln His Leu Ser Glu Thr Phe Asp Asn
760 765 770 775 

ATA GAC AAT TTA AGT CCC AAG GCA TCT CAT CGT AGT AAG CAG AGA CAC 2406 
Ile Asp Asn Leu Ser Pro Lys Ala Ser His Arg Ser Lys Gln Arg His
780 785 790 

AAG CAA AGT CTC TAT GGT GAT TAT GTT TTT GAC ACC AAT CGA CAT GAT 2454 
Lys Gln Ser Leu Tyr Gly Asp Tyr Val Phe Asp Thr Asn Arg His Asp
795 800 805 

GAT AAT AGG TCA GAC AAT TTT AAT ACT GGC AAC ATG ACT GTC CTT TCA 2502
Asp Asn Arg Ser Asp Asn Phe Asn Thr Gly Asn Met Thr Val Leu Ser
810 815 820 

CCA TAT TTG AAT ACT ACA GTG TTA CCC AGC TCC TCT TCA TCA AGA GGA 2550 
Pro Tyr Leu Asn Thr Thr Val Leu Pro Ser Ser Ser Ser Ser Arg Gly 
825 830 835 

AGC TTA GAT AGT TCT CGT TCT GAA AAA GAT AGA AGT TTG GAG AGA GAA 2598 
Ser Leu Asp Ser Ser Arg Ser Glu Lys Asp Arg Ser Leu Glu Arg Glu 
840 845 850 855 

CGC GGA ATT GGT CTA GGC AAC TAC CAT CCA GCA ACA GAA AAT CCA GGA 2646 
Arg Gly Ile Gly Leu Gly Asn Tyr His Pro Ala Thr Glu Asn Pro Gly
860 865 870 

ACT TCT TCA AAG CGA GGT TTG CAG ATC TCC ACC ACT GCA GCC CAG ATT 2694
Thr Ser Ser Lys Arg Gly Leu Gln Ile Ser Thr Thr Ala Ala Gln Ile
875 880 885 

GCC AAA GTC ATG GAA GAA GTG TCA GCC ATT CAT ACC TCT CAG GAA GAC 2742
Ala Lys Val Met Glu Glu Val Ser Ala Ile His Thr Ser Gln Glu Asp
890 895 900 

AGA AGT TCT GGG TCT ACC ACT GAA TTA CAT TGT GTG ACA GAT GAG AGA 2790 
Arg Ser Ser Gly Ser Thr Thr Glu Leu His Cys Val Thr Asp Glu Arg 
905 910 915 

AAT GCA CTT AGA AGA AGC TCT GCT GCC CAT ACA CAT TCA AAC ACT TAC 2838
Asn Ala Leu Arg Arg Ser Ser Ala Ala His Thr His Ser Asn Thr Tyr
920 925 930 935

AAT TTC ACT AAG TCG GAA AAT TCA AAT AGG ACA TGT TCT ATG CCT TAT 2886
Asn Phe Thr Lys Ser Glu Asn Ser Asn Arg Thr Cys Ser Met Pro Tyr
940 945 950 

GCC AAA TTA GAA TAC AAG AGA TCT TCA AAT GAT AGT TTA AAT AGT GTC 2934
Ala Lys Leu Glu Tyr Lys Arg Ser Ser Asn Asp Ser Leu Asn Ser Val
955 960 965 

AGT AGT AAT GAT GGT TAT GGT AAA AGA GGT CAA ATG AAA CCC TCG ATT 2982
Ser Ser Asn Asp Gly Tyr Gly Lys Arg Gly Gln Met Lys Pro Ser Ile
970 975 980

GAA TCC TAT TCT GAA GAT GAT GAA AGT AAG TTT TGC AGT TAT GGT CAA 3030
Glu Ser Tyr Ser Glu Asp Asp Glu Ser Lys Phe Cys Ser Tyr Gly Gln
985 990 995 

TAC CCA GCC GAC CTA GCC CAT AAA ATA CAT AGT GCA AAT CAT ATG GAT 3078
Tyr Pro Ala Asp Leu Ala His Lys Ile His Ser Ala Asn His Met Asp
1000 1005 1010 1015 

GAT AAT GAT GGA GAA CTA GAT ACA CCA ATA AAT TAT AGT CTT AAA TAT 3126
Asp Asn Asp Gly Glu Leu Asp Thr Pro Ile Asn Tyr Ser Leu Lys Tyr
1020 1025 1030 

TCA GAT GAG CAG TTG AAC TCT GGA AGG CAA AGT CCT TCA CAG AAT GAA 3174 
Ser Asp Glu Gln Leu Asn Ser Gly Arg Gln Ser Pro Ser Gln Asn Glu
1035 1040 1045 

AGA TGG GCA AGA CCC AAA CAC ATA ATA GAA GAT GAA ATA AAA CAA AGT 3222 
Arg Trp Ala Arg Pro Lys His Ile Ile Glu Asp Glu Ile Lys Gln Ser
1050 1055 1060 

GAG CAA AGA CAA TCA AGG AAT CAA AGT ACA ACT TAT CCT GTT TAT ACT 3270
Glu Gln Arg Gln Ser Arg Asn Gln Ser Thr Thr Tyr Pro Val Tyr Thr
1065 1070 1075 

GAG AGC ACT GAT GAT AAA CAC CTC AAG TTC CAA CCA CAT TTT GGA CAG 3318 
Glu Ser Thr Asp Asp Lys His Leu Lys Phe Gln Pro His Phe Gly Gln
1080 1085 1090 1095 

CAG GAA TGT GTT TCT CCA TAC AGG TCA CGG GGA GCC AAT GGT TCA GAA 3366
Gln Glu Cys Val Ser Pro Tyr Arg Ser Arg Gly Ala Asn Gly Ser Glu
1100 1105 1110

ACA AAT CGA GTG GGT TCT AAT CAT GGA ATT AAT CAA AAT GTA AGC CAG 3414 
Thr Asn Arg Val Gly Ser Asn His Gly Ile Asn Gln Asn Val Ser Gln
1115 1120 1125

TCT TTG TGT CAA GAA GAT GAC TAT GAA GAT GAT AAG CCT ACC AAT TAT 3462 
Ser Leu Cys Gln Glu Asp Asp Tyr Glu Asp Asp Lys Pro Thr Asn Tyr
1130 1135 1140

AGT GAA CGT TAC TCT GAA GAA GAA CAG CAT GAA GAA GAA GAG AGA CCA 3510 
Ser Glu Arg Tyr Ser Glu Glu Glu Gln His Glu Glu Glu Glu Arg Pro
1145 1150 1155

ACA AAT TAT AGC ATA AAA TAT AAT GAA GAG AAA CGT CAT GTG GAT CAG 3558
Thr Asn Tyr Ser Ile Lys Tyr Asn Glu Glu Lys Arg His Val Asp Gln
1160 1165 1170 1175

CCT ATT GAT TAT AGT TTA AAA TAT GCC ACA GAT ATT CCT TCA TCA CAG 3606
Pro Ile Asp Tyr Ser Leu Lys Tyr Ala Thr Asp Ile Pro Ser Ser Gln
1180 1185 1190

AAA CAG TCA TTT TCA TTC TCA AAG AGT TCA TCT GGA CAA AGC AGT AAA 3654 
Lys Gln Ser Phe Ser Phe Ser Lys Ser Ser Ser Gly Gln Ser Ser Lys
1195 1200 1205

ACC GAA CAT ATG TCT TCA AGC AGT GAG AAT ACG TCC ACA CCT TCA TCT 3702
Thr Glu His Met Ser Ser Ser Ser Glu Asn Thr Ser Thr Pro Ser Ser 
1210 1215 1220 

AAT GCC AAG AGG CAG AAT CAG CTC CAT CCA AGT TCT GCA CAG AGT AGA 3750
Asn Ala Lys Arg Gln Asn Gln Leu His Pro Ser Ser Ala Gln Ser Arg
1225 1230 1235 

AGT GGT CAG CCT CAA AAG GCT GCC ACT TGC AAA GTT TCT TCT ATT AAC 3798 
Ser Gly Gln Pro Gln Lys Ala Ala Thr Cys Lys Val Ser Ser Ile Asn
1240 1245 1250 1255 

CAA GAA ACA ATA CAG ACT TAT TGT GTA GAA GAT ACT CCA ATA TGT TTT 3846
Gln Glu Thr Ile Gln Thr Tyr Cys Val Glu Asp Thr Pro Ile Cys Phe
1260 1265 1270 

TCA AGA TGT AGT TCA TTA TCA TCT TTG TCA TCA GCT GAA GAT GAA ATA 3894
Ser Arg Cys Ser Ser Leu Ser Ser Leu Ser Ser Ala Glu Asp Glu Ile
1275 1280 1285 

GGA TGT AAT CAG ACG ACA CAG GAA GCA GAT TCT GCT AAT ACC CTG CAA 3942
Gly Cys Asn Gln Thr Thr Gln Glu Ala Asp Ser Ala Asn Thr Leu Gln
1290 1295 1300 

ATA GCA GAA ATA AAA GGA AAG ATT GGA ACT AGG TCA GCT GAA GAT CCT 3990
Ile Ala Glu Ile Lys Gly Lys Ile Gly Thr Arg Ser Ala Glu Asp Pro
1305 1310 1315 

GTG AGC GAA GTT CCA GCA GTG TCA CAG CAC CCT AGA ACC AAA TCC AGC 4038
Val Ser Glu Val Pro Ala Val Ser Gln His Pro Arg Thr Lys Ser Ser
1320 1325 1330 1335 

AGA CTG CAG GGT TCT AGT TTA TCT TCA GAA TCA GCC AGG CAC AAA GCT 4086
Arg Leu Gln Gly Ser Ser Leu Ser Ser Glu Ser Ala Arg His Lys Ala
1340 1345 1350 

GTT GAA TTT CCT TCA GGA GCG AAA TCT CCC TCC AAA AGT GGT GCT CAG 4134 
Val Glu Phe Pro Ser Gly Ala Lys Ser Pro Ser Lys Ser Gly Ala Gln
1355 1360 1365 

ACA CCC AAA AGT CCA CCT GAA CAC TAT GTT CAG GAG ACC CCA CTC ATG 4182 
Thr Pro Lys Ser Pro Pro Glu His Tyr Val Gln Glu Thr Pro Leu Met
1370 1375 1380 

TTT AGC AGA TGT ACT TCT GTC AGT TCA CTT GAT AGT TTT GAG AGT CGT 4230
Phe Ser Arg Cys Thr Ser Val Ser Ser Leu Asp Ser Phe Glu Ser Arg
1385 1390 1395 

TCG ATT GCC AGC TCC GTT CAG AGT GAA CCA TGC AGT GGA ATG GTA AGT 4278
Ser Ile Ala Ser Ser Val Gln Ser Glu Pro Cys Ser Gly Met Val Ser
1400 1405 1410 1415 

GGC ATT ATA AGC CCC AGT GAT CTT CCA GAT AGC CCT GGA CAA ACC ATG 4326
Gly Ile Ile Ser Pro Ser Asp Leu Pro Asp Ser Pro Gly Gln Thr Met
1420 1425 1430

CCA CCA AGC AGA AGT AAA ACA CCT CCA CCA CCT CCT CAA ACA GCT CAA 4374
Pro Pro Ser Arg Ser Lys Thr Pro Pro Pro Pro Pro Gln Thr Ala Gln
1435 1440 1445 

ACC AAG CGA GAA GTA CCT AAA AAT AAA GCA CCT ACT GCT GAA AAG AGA 4422 
Thr Lys Arg Glu Val Pro Lys Asn Lys Ala Pro Thr Ala Glu Lys Arg 
1450 1455 1460

GAG AGT GGA CCT AAG CAA GCT GCA GTA AAT GCT GCA GTT CAG AGG GTC 4470 
Glu Ser Gly Pro Lys Gln Ala Ala Val Asn Ala Ala Val Gln Arg Val
1465 1470 1475 

CAG GTT CTT CCA GAT GCT GAT ACT TTA TTA CAT TTT GCC ACA GAA AGT 4518 
Gln Val Leu Pro Asp Ala Asp Thr Leu Leu His Phe Ala Thr Glu Ser
1480 1485 1490 1495 

ACT CCA GAT GGA TTT TCT TGT TCA TCC AGC CTG AGT GCT CTG AGC CTC 4566 
Thr Pro Asp Gly Phe Ser Cys Ser Ser Ser Leu Ser Ala Leu Ser Leu 
1500 1505 1510 

GAT GAG CCA TTT ATA CAG AAA GAT GTG GAA TTA AGA ATA ATG CCT CCA 4614 
Asp Glu Pro Phe Ile Gln Lys Asp Val Glu Leu Arg Ile Met Pro Pro
1515 1520 1525 

GTT CAG GAA AAT GAC AAT GGG AAT GAA ACA GAA TCA GAG CAG CCT AAA 4662
Val Gln Glu Asn Asp Asn Gly Asn Glu Thr Glu Ser Glu Gln Pro Lys
1530 1535 1540 

GAA TCA AAT GAA AAC CAA GAG AAA GAG GCA GAA AAA ACT ATT GAT TCT 4710 
Glu Ser Asn Glu Asn Gln Glu Lys Glu Ala Glu Lys Thr Ile Asp Ser
1545 1550 1555

GAA AAG GAC CTA TTA GAT GAT TCA GAT GAT GAT GAT ATT GAA ATA CTA 4758
Glu Lys Asp Leu Leu Asp Asp Ser Asp Asp Asp Asp Ile Glu Ile Leu
1560 1565 1570 1575 

GAA GAA TGT ATT ATT TCT GCC ATG CCA ACA AAG TCA TCA CGT AAA GGC 4806
Glu Glu Cys Ile Ile Ser Ala Met Pro Thr Lys Ser Ser Arg Lys Gly
1580 1585 1590 

AAA AAG CCA GCC CAG ACT GCT TCA AAA TTA CCT CCA CCT GTG GCA AGG 4854 
Lys Lys Pro Ala Gln Thr Ala Ser Lys Leu Pro Pro Pro Val Ala Arg
1595 1600 1605 

AAA CCA AGT CAG CTG CCT GTG TAC AAA CTT CTA CCA TCA CAA AAC AGG 4902 
Lys Pro Ser Gln Leu Pro Val Tyr Lys Leu Leu Pro Ser Gln Asn Arg
1610 1615 1620 

TTG CAA CCC CAA AAG CAT GTT AGT TTT ACA CCG GGG GAT GAT ATG CCA 4950 
Leu Gln Pro Gln Lys His Val Ser Phe Thr Pro Gly Asp Asp Met Pro
1625 1630 1635

CGG GTG TAT TGT GTT GAA GGG ACA CCT ATA AAC TTT TCC ACA GCT ACA 4998
Arg Val Tyr Cys Val Glu Gly Thr Pro Ile Asn Phe Ser Thr Ala Thr
1640 1645 1650 1655

TCT CTA AGT GAT CTA ACA ATC GAA TCC CCT CCA AAT GAG TTA GCT GCT 5046
Ser Leu Ser Asp Leu Thr Ile Glu Ser Pro Pro Asn Glu Leu Ala Ala
1660 1665 1670 

GGA GAA GGA GTT AGA GGA GGA GCA CAG TCA GGT GAA TTT GAA AAA CGA 5094 
Gly Glu Gly Val Arg Gly Gly Ala Gln Ser Gly Glu Phe Glu Lys Arg
1675 1680 1685 

GAT ACC ATT CCT ACA GAA GGC AGA AGT ACA GAT GAG GCT CAA GGA GGA 5142 
Asp Thr Ile Pro Thr Glu Gly Arg Ser Thr Asp Glu Ala Gln Gly Gly
1690 1695 1700 

AAA ACC TCA TCT GTA ACC ATA CCT GAA TTG GAT GAC AAT AAA GCA GAG 5190 
Lys Thr Ser Ser Val Thr Ile Pro Glu Leu Asp Asp Asn Lys Ala Glu
1705 1710 1715 

GAA GGT GAT ATT CTT GCA GAA TGC ATT AAT TCT GCT ATG CCC AAA GGG 5238 
Glu Gly Asp Ile Leu Ala Glu Cys Ile Asn Ser Ala Met Pro Lys Gly
1720 1725 1730 1735 

AAA AGT CAC AAG CCT TTC CGT GTG AAA AAG ATA ATG GAC CAG GTC CAG 5286
Lys Ser His Lys Pro Phe Arg Val Lys Lys Ile Met Asp Gln Val Gln
1740 1745 1750 

CAA GCA TCT GCG TCG TCT TCT GCA CCC AAC AAA AAT CAG TTA GAT GGT 5334 
Gln Ala Ser Ala Ser Ser Ser Ala Pro Asn Lys Asn Gln Leu Asp Gly
1755 1760 1765 

AAG AAA AAG AAA CCA ACT TCA CCA GTA AAA CCT ATA CCA CAA AAT ACT 5382
Lys Lys Lys Lys Pro Thr Ser Pro Val Lys Pro Ile Pro Gln Asn Thr
1770 1775 1780 

GAA TAT AGG ACA CGT GTA AGA AAA AAT GCA GAC TCA AAA AAT AAT TTA 5430
Glu Tyr Arg Thr Arg Val Arg Lys Asn Ala Asp Ser Lys Asn Asn Leu
1785 1790 1795 

AAT GCT GAG AGA GTT TTC TCA GAC AAC AAA GAT TCA AAG AAA CAG AAT 5478 
Asn Ala Glu Arg Val Phe Ser Asp Asn Lys Asp Ser Lys Lys Gln Asn
1800 1805 1810 1815 

TTG AAA AAT AAT TCC AAG GAC TTC AAT GAT AAG CTC CCA AAT AAT GAA 5526
Leu Lys Asn Asn Ser Lys Asp Phe Asn Asp Lys Leu Pro Asn Asn Glu
1820 1825 1830 

GAT AGA GTC AGA GGA AGT TTT GCT TTT GAT TCA CCT CAT CAT TAC ACG 5574 
Asp Arg Val Arg Gly Ser Phe Ala Phe Asp Ser Pro His His Tyr Thr 
1835 1840 1845 

CCT ATT GAA GGA ACT CCT TAC TGT TTT TCA CGA AAT GAT TCT TTG AGT 5622 
Pro Ile Glu Gly Thr Pro Tyr Cys Phe Ser Arg Asn Asp Ser Leu Ser
1850 1855 1860

TCT CTA GAT TTT GAT GAT GAT GAT GTT GAC CTT TCC AGG GAA AAG GCT 5670 
Ser Leu Asp Phe Asp Asp Asp Asp Val Asp Leu Ser Arg Glu Lys Ala 
1865 1870 1875

GAA TTA AGA AAG GCA AAA GAA AAT AAG GAA TCA GAG GCT AAA GTT ACC 5718
Glu Leu Arg Lys Ala Lys Glu Asn Lys Glu Ser Glu Ala Lys Val Thr
1880 1885 1890 1895

AGC CAC ACA GAA CTA ACC TCC AAC CAA CAA TCA GCT AAT AAG ACA CAA 5766 
Ser His Thr Glu Leu Thr Ser Asn Gln Gln Ser Ala Asn Lys Thr Gln
1900 1905 1910 

GCT ATT GCA AAG CAG CCA ATA AAT CGA GGT CAG CCT AAA CCC ATA CTT 5814
Ala Ile Ala Lys Gln Pro Ile Asn Arg Gly Gln Pro Lys Pro Ile Leu
1915 1920 1925 

CAG AAA CAA TCC ACT TTT CCC CAG TCA TCC AAA GAC ATA CCA GAC AGA 5862 
Gln Lys Gln Ser Thr Phe Pro Gln Ser Ser Lys Asp Ile Pro Asp Arg
1930 1935 1940 

GGG GCA GCA ACT GAT GAA AAG TTA CAG AAT TTT GCT ATT GAA AAT ACT 5910 
Gly Ala Ala Thr Asp Glu Lys Leu Gln Asn Phe Ala Ile Glu Asn Thr
1945 1950 1955

CCA GTT TGC TTT TCT CAT AAT TCC TCT CTG AGT TCT CTC AGT GAC ATT 5958 
Pro Val Cys Phe Ser His Asn Ser Ser Leu Ser Ser Leu Ser Asp Ile
1960 1965 1970 1975

GAC CAA GAA AAC AAC AAT AAA GAA AAT GAA CCT ATC AAA GAG ACT GAG 6006
Asp Gln Glu Asn Asn Asn Lys Glu Asn Glu Pro Ile Lys Glu Thr Glu
1980 1985 1990 

CCC CCT GAC TCA CAG GGA GAA CCA AGT AAA CCT CAA GCA TCA GGC TAT 6054 
Pro Pro Asp Ser Gln Gly Glu Pro Ser Lys Pro Gln Ala Ser Gly Tyr
1995 2000 2005 

GCT CCT AAA TCA TTT CAT GTT GAA GAT ACC CCA GTT TGT TTC TCA AGA 6102 
Ala Pro Lys Ser Phe His Val Glu Asp Thr Pro Val Cys Phe Ser Arg 
2010 2015 2020 

AAC AGT TCT CTC AGT TCT CTT AGT ATT GAC TCT GAA GAT GAC CTG TTG 6150 
Asn Ser Ser Leu Ser Ser Leu Ser Ile Asp Ser Glu Asp Asp Leu Leu
2025 2030 2035 

CAG GAA TGT ATA AGC TCC GCA ATG CCA AAA AAG AAA AAG CCT TCA AGA 6198 
Gln Glu Cys Ile Ser Ser Ala Met Pro Lys Lys Lys Lys Pro Ser Arg
2040 2045 2050 2055 

CTC AAG GGT GAT AAT GAA AAA CAT AGT CCC AGA AAT ATG GGT GGC ATA 6246 
Leu Lys Gly Asp Asn Glu Lys His Ser Pro Arg Asn Met Gly Gly Ile
2060 2065 2070 

TTA GGT GAA GAT CTG ACA CTT GAT TTG AAA GAT ATA CAG AGA CCA GAT 6294
Leu Gly Glu Asp Leu Thr Leu Asp Leu Lys Asp Ile Gln Arg Pro Asp
2075 2080 2085 

TCA GAA CAT GGT CTA TCC CCT GAT TCA GAA AAT TTT GAT TGG AAA GCT 6342 
Ser Glu His Gly Leu Ser Pro Asp Ser Glu Asn Phe Asp Trp Lys Ala 
2090 2095 2100

ATT CAG GAA GGT GCA AAT TCC ATA GTA AGT AGT TTA CAT CAA GCT GCT 6390
Ile Gln Glu Gly Ala Asn Ser Ile Val Ser Ser Leu His Gln Ala Ala
2105 2110 2115 

GCT GCT GCA TGT TTA TCT AGA CAA GCT TCG TCT GAT TCA GAT TCC ATC 6438
Ala Ala Ala Cys Leu Ser Arg Gln Ala Ser Ser Asp Ser Asp Ser Ile
2120 2125 2130 2135 

CTT TCC CTG AAA TCA GGA ATC TCT CTG GGA TCA CCA TTT CAT CTT ACA 6486 
Leu Ser Leu Lys Ser Gly Ile Ser Leu Gly Ser Pro Phe His Leu Thr
2140 2145 2150 



CCT GAT CAA GAA GAA AAA CCC TTT ACA AGT AAT AAA GGC CCA CGA ATT 6534 
Pro Asp Gln Glu Glu Lys Pro Phe Thr Ser Asn Lys Gly Pro Arg Ile
2155 2160 2165 

CTA AAA CCA GGG GAG AAA AGT ACA TTG GAA ACT AAA AAG ATA GAA TCT 6582 
Leu Lys Pro Gly Glu Lys Ser Thr Leu Glu Thr Lys Lys Ile Glu Ser
2170 2175 2180 

GAA AGT AAA GGA ATC AAA GGA GGA AAA AAA GTT TAT AAA AGT TTG ATT 6630
Glu Ser Lys Gly Ile Lys Gly Gly Lys Lys Val Tyr Lys Ser Leu Ile
2185 2190 2195 

ACT GGA AAA GTT CGA TCT AAT TCA GAA ATT TCA GGC CAA ATG AAA CAG 6678 
Thr Gly Lys Val Arg Ser Asn Ser Glu Ile Ser Gly Gln Met Lys Gln
2200 2205 2210 2215 

CCC CTT CAA GCA AAC ATG CCT TCA ATC TCT CGA GGC AGG ACA ATG ATT 6726 
Pro Leu Gln Ala Asn Met Pro Ser Ile Ser Arg Gly Arg Thr Met Ile
2220 2225 2230 

CAT ATT CCA GGA GTT CGA AAT AGC TCC TCA AGT ACA AGT CCT GTT TCT 6774 
His Ile Pro Gly Val Arg Asn Ser Ser Ser Ser Thr Ser Pro Val Ser
2235 2240 2245 

AAA AAA GGC CCA CCC CTT AAG ACT CCA GCC TCC AAA AGC CCT AGT GAA 6822 
Lys Lys Gly Pro Pro Leu Lys Thr Pro Ala Ser Lys Ser Pro Ser Glu 
2250 2255 2260 

GGT CAA ACA GCC ACC ACT TCT CCT AGA GGA GCC AAG CCA TCT GTG AAA 6870 
Gly Gln Thr Ala Thr Thr Ser Pro Arg Gly Ala Lys Pro Ser Val Lys
2265 2270 2275 

TCA GAA TTA AGC CCT GTT GCC AGG CAG ACA TCC CAA ATA GGT GGG TCA 6918 
Ser Glu Leu Ser Pro Val Ala Arg Gln Thr Ser Gln Ile Gly Gly Ser
2280 2285 2290 2295 

AGT AAA GCA CCT TCT AGA TCA GGA TCT AGA GAT TCG ACC CCT TCA AGA 6966 
Ser Lys Ala Pro Ser Arg Ser Gly Ser Arg Asp Ser Thr Pro Ser Arg 
2300 2305 2310 

CCT GCC CAG CAA CCA TTA AGT AGA CCT ATA CAG TCT CCT GGC CGA AAC 7014 
Pro Ala Gln Gln Pro Leu Ser Arg Pro Ile Gln Ser Pro Gly Arg Asn
2315 2320 2325

TCA ATT TCC CCT GGT AGA AAT GGA ATA AGT CCT CCT AAC AAA TTA TCT 7062
Ser Ile Ser Pro Gly Arg Asn Gly Ile Ser Pro Pro Asn Lys Leu Ser
2330 2335 2340 

CAA CTT CCA AGG ACA TCA TCC CCT AGT ACT GCT TCA ACT AAG TCC TCA 7110 
Gln Leu Pro Arg Thr Ser Ser Pro Ser Thr Ala Ser Thr Lys Ser Ser
2345 2350 2355 

GGT TCT GGA AAA ATG TCA TAT ACA TCT CCA GGT AGA CAG ATG AGC CAA 7158 
Gly Ser Gly Lys Met Ser Tyr Thr Ser Pro Gly Arg Gln Met Ser Gln
2360 2365 2370 2375 

CAG AAC CTT ACC AAA CAA ACA GGT TTA TCC AAG AAT GCC AGT AGT ATT 7206 
Gln Asn Leu Thr Lys Gln Thr Gly Leu Ser Lys Asn Ala Ser Ser Ile
2380 2385 2390 

CCA AGA AGT GAG TCT GCC TCC AAA GGA CTA AAT CAG ATG AAT AAT GGT 7254 
Pro Arg Ser Glu Ser Ala Ser Lys Gly Leu Asn Gln Met Asn Asn Gly
2395 2400 2405 

AAT GGA GCC AAT AAA AAG GTA GAA CTT TCT AGA ATG TCT TCA ACT AAA 7302
Asn Gly Ala Asn Lys Lys Val Glu Leu Ser Arg Met Ser Ser Thr Lys
2410 2415 2420 

TCA AGT GGA AGT GAA TCT GAT AGA TCA GAA AGA CCT GTA TTA GTA CGC 7350 
Ser Ser Gly Ser Glu Ser Asp Arg Ser Glu Arg Pro Val Leu Val Arg 
2425 2430 2435 

CAG TCA ACT TTC ATC AAA GAA GCT CCA AGC CCA ACC TTA AGA AGA AAA 7398
Gln Ser Thr Phe Ile Lys Glu Ala Pro Ser Pro Thr Leu Arg Arg Lys
2440 2445 2450 2455 

TTG GAG GAA TCT GCT TCA TTT GAA TCT CTT TCT CCA TCA TCT AGA CCA 7446 
Leu Glu Glu Ser Ala Ser Phe Glu Ser Leu Ser Pro Ser Ser Arg Pro 
2460 2465 2470 

GCT TCT CCC ACT AGG TCC CAG GCA CAA ACT CCA GTT TTA AGT CCT TCC 7494 
Ala Ser Pro Thr Arg Ser Gln Ala Gln Thr Pro Val Leu Ser Pro Ser
2475 2480 2485 

CTT CCT GAT ATG TCT CTA TCC ACA CAT TCG TCT GTT CAG GCT GGT GGA 7542 
Leu Pro Asp Met Ser Leu Ser Thr His Ser Ser Val Gln Ala Gly Gly
2490 2495 2500 

TGG CGA AAA CTC CCA CCT AAT CTC AGT CCC ACT ATA GAG TAT AAT GAT 7590 
Trp Arg Lys Leu Pro Pro Asn Leu Ser Pro Thr Ile Glu Tyr Asn Asp
2505 2510 2515 

GGA AGA CCA GCA AAG CGC CAT GAT ATT GCA CGG TCT CAT TCT GAA AGT 7638
Gly Arg Pro Ala Lys Arg His Asp Ile Ala Arg Ser His Ser Glu Ser
2520 2525 2530 2535

CCT TCT AGA CTT CCA ATC AAT AGG TCA GGA ACC TGG AAA CGT GAG CAC 7686 
Pro Ser Arg Leu Pro Ile Asn Arg Ser Gly Thr Trp Lys Arg Glu His
2540 2545 2550

AGC AAA CAT TCA TCA TCC CTT CCT CGA GTA AGC ACT TGG AGA AGA ACT 7734
Ser Lys His Ser Ser Ser Leu Pro Arg Val Ser Thr Trp Arg Arg Thr 
2555 2560 2565 

GGA AGT TCA TCT TCA ATT CTT TCT GCT TCA TCA GAA TCC AGT GAA AAA 7782 
Gly Ser Ser Ser Ser Ile Leu Ser Ala Ser Ser Glu Ser Ser Glu Lys
2570 2575 2580 

GCA AAA AGT GAG GAT GAA AAA CAT GTG AAC TCT ATT TCA GGA ACC AAA 7830
Ala Lys Ser Glu Asp Glu Lys His Val Asn Ser Ile Ser Gly Thr Lys
2585 2590 2595 

CAA AGT AAA GAA AAC CAA GTA TCC GCA AAA GGA ACA TGG AGA AAA ATA 7878
Gln Ser Lys Glu Asn Gln Val Ser Ala Lys Gly Thr Trp Arg Lys Ile
2600 2605 2610 2615 

AAA GAA AAT GAA TTT TCT CCC ACA AAT AGT ACT TCT CAG ACC GTT TCC 7926
Lys Glu Asn Glu Phe Ser Pro Thr Asn Ser Thr Ser Gln Thr Val Ser
2620 2625 2630 

TCA GGT GCT ACA AAT GGT GCT GAA TCA AAG ACT CTA ATT TAT CAA ATG 7974 
Ser Gly Ala Thr Asn Gly Ala Glu Ser Lys Thr Leu Ile Tyr Gln Met
2635 2640 2645 

GCA CCT GCT GTT TCT AAA ACA GAG GAT GTT TGG GTG AGA ATT GAG GAC 8022 
Ala Pro Ala Val Ser Lys Thr Glu Asp Val Trp Val Arg Ile Glu Asp
2650 2655 2660 

TGT CCC ATT AAC AAT CCT AGA TCT GGA AGA TCT CCC ACA GGT AAT ACT 8070 
Cys Pro Ile Asn Asn Pro Arg Ser Gly Arg Ser Pro Thr Gly Asn Thr
2665 2670 2675 

CCC CCG GTG ATT GAC AGT GTT TCA GAA AAG GCA AAT CCA AAC ATT AAA 8118 
Pro Pro Val Ile Asp Ser Val Ser Glu Lys Ala Asn Pro Asn Ile Lys
2680 2685 2690 2695 

GAT TCA AAA GAT AAT CAG GCA AAA CAA AAT GTG GGT AAT GGC AGT GTT 8166 
Asp Ser Lys Asp Asn Gln Ala Lys Gln Asn Val Gly Asn Gly Ser Val
2700 2705 2710 

CCC ATG CGT ACC GTG GGT TTG GAA AAT CGC CTG ACC TCC TTT ATT CAG 8214 
Pro Met Arg Thr Val Gly Leu Glu Asn Arg Leu Thr Ser Phe Ile Gln
2715 2720 2725 

GTG GAT GCC CCT GAC CAA AAA GGA ACT GAG ATA AAA CCA GGA CAA AAT 8262 
Val Asp Ala Pro Asp Gln Lys Gly Thr Glu Ile Lys Pro Gly Gln Asn
2730 2735 2740 

AAT CCT GTC CCT GTA TCA GAG ACT AAT GAA AGT CCT ATA GTG GAA CGT 8310 
Asn Pro Val Pro Val Ser Glu Thr Asn Glu Ser Pro Ile Val Glu Arg
2745 2750 2755 

ACC CCA TTC AGT TCT AGC AGC TCA AGC AAA CAC AGT TCA CCT AGT GGG 8358 
Thr Pro Phe Ser Ser Ser Ser Ser Ser Lys His Ser Ser Pro Ser Gly 
2760 2765 2770 2775

ACT GTT GCT GCC AGA GTG ACT CCT TTT AAT TAC AAC CCA AGC CCT AGG 8406
Thr Val Ala Ala Arg Val Thr Pro Phe Asn Tyr Asn Pro Ser Pro Arg 
2780 2785 2790 

AAA AGC AGC GCA GAT AGC ACT TCA GCT CGG CCA TCT CAG ATC CCA ACT 8454
Lys Ser Ser Ala Asp Ser Thr Ser Ala Arg Pro Ser Gln Ile Pro Thr
2795 2800 2805 

CCA GTG AAT AAC AAC ACA AAG AAG CGA GAT TCC AAA ACT GAC AGC ACA 8502 
Pro Val Asn Asn Asn Thr Lys Lys Arg Asp Ser Lys Thr Asp Ser Thr 
2810 2815 2820 

GAA TCC AGT GGA ACC CAA AGT CCT AAG CGC CAT TCT GGG TCT TAC CTT 8550 
Glu Ser Ser Gly Thr Gln Ser Pro Lys Arg His Ser Gly Ser Tyr Leu
2825 2830 2835 

GTG ACA TCT GTT TAAAAGAGAG GAAGAATGAA ACTAAGAAAA TTCTATGTTA 8602
Val Thr Ser Val
2840

ATTACAACTG CTATATAGAC ATTTTGTTTC AAATGAAACT TTAAAAGACT GAAAAATTTT 8662 

GTAAATAGGT TTGATTCTTG TTAGAGGGTT TTTGTTCTGG AAGCCATATT TGATAGTATA 8722 

CTTTGTCTTC ACTGGTCTTA TTTTGGGAGG CACTCTTGAT GGTTAGGAAA AAATAGAAAG 8782 

CCAAGTATGT TTGTACAGTA TGTTTTACAT GTATTTAAAG TAGCATCCCA TCCCAACTTC 8842 

CTTAATTATT GCTTGTCTAA AATAATGAAC ACTACAGATA GGAAATATGA TATATTGCTG 8902 

TTATCAATCA TTTCTAGATT ATAAACTGAC TAAACTTACA TCAGGGGAAA ATTGGTATTT 8962 

ATGCAAAAAA AAAATGTTTT TGTCCTTGTG AGTCCATCTA ACATCATAAT TAATCATGTG 9022 

GCTGTGAAAT TCACAGTAAT ATGGTTCCCG ATGAACAAGT TTACCCAGCC TGCTTTGCTT 9082 

ACTGCATGAA TGAAACTGAT GGTTCAATTT CAGAAGTAAT GATTAACAGT TATGTGGTCA 9142 

CATGATGTGC ATAGAGATAG CTACAGTGTA ATAATTTACA CTATTTTGTG CTCCAAACAA 9202

AACAAAAATC TGTGTAACTG TAAAACATTG AATGAAACTA TTTTACCTGA ACTAGATTTT 9262

ATCTGAAAGT AGGTAGAATT TTTGCTATGC TGTAATTTGT TGTATATTCT GGTATTTGAG 9322 

GTGAGATGGC TGCTCTTTAT TAATGAGACA TGAATTGTGT CTCAACAGAA ACTAAATGAA 9382

CATTTCAGAA TAAATTATTG CTGTATGTAA ACTGTTACTG AAATTGGTAT TTGTTTGAAG 9442 

GGTTTGTTTC ACATTTGTAT TAATTAATTG TTTAAAATGC CTCTTTTAAA AGCTTATATA 9502

AATTTTTTCT TCAGCTTCTA TGCATTAAGA GTAAAATTCC TCTTACTGTA ATAAAAACAT 9562 

TGAAGAAGAC TGTTGCCACT TAACCATTCC ATGCGTTGGC ACTT 9606

(2) INFORMATION FOR SEQ ID NO: 2:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2843 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu
1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn
20 25 30 

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu 
35 40 45 

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly
50 55 60 

Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser
65 70 75 80 

Asn Phe Pro Gly Val Lys Leu Arg Ser Lys Met Ser Leu Arg Ser Tyr 
85 90 95 

Gly Ser Arg Glu Gly Ser Val Ser Ser Arg Ser Gly Glu Cys Ser Pro 
100 105 110 

Val Pro Met Gly Ser Phe Pro Arg Arg Gly Phe Val Asn Gly Ser Arg 
115 120 125 

Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu 
130 135 140 

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala 
145 150 155 160 

Gln Leu Gln Asn Leu Thr Lys Arg Ile Asp Ser Leu Pro Leu Thr Glu
165 170 175 

Asn Phe Ser Leu Gln Thr Asp Leu Thr Arg Arg Gln Leu Glu Tyr Glu
180 185 190 

Ala Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln
195 200 205 

Asp Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile
210 215 220 

Glu Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr
225 230 235 240

Glu Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp
245 250 255 

Ala Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala
260 265 270 

Thr Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr
275 280 285 

Ala Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu 
290 295 300 

Thr Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser 
305 310 315 320 

Met Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala 
325 330 335 

Met Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys
340 345 350 

Leu Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val
355 360 365 

Leu Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser 
370 375 380 

Ala Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly
385 390 395 400 

Arg Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr
405 410 415 

Cys Glu Thr Cys Trp Glu Trp Gin Glu Ala His Glu Pro Gly Met Asp 
420 425 430 

Gln Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro
435 440 445 

Ala Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His 
450 455 460 

Ala Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln
465 470 475 480 

Val Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr
485 490 495 

Leu Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp 
500 505 510 

Val Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala 
515 520 525 

Leu Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile
530 535 540

Ala Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys
545 550 555 560

Lys Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala
565 570 575

Leu Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu
580 585 590

Trp Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala
595 600 605

Val Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser
610 615 620

Gln Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg
625 630 635 640

Asn Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu
645 650 655

Arg Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His
660 665 670

Ser Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser
675 680 685

Ala Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val
690 695 700

Ser Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met
705 710 715 720

Gly Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys
725 730 735

Tyr Lys Asp Ala Asn Ile Met Ser Pro Gly Ser Ser Leu Pro Ser Leu
740 745 750

His Val Arg Lys Gln Lys Ala Leu Glu Ala Glu Leu Asp Ala Gln His
755 760 765

Leu Ser Glu Thr Phe Asp Asn Ile Asp Asn Leu Ser Pro Lys Ala Ser
770 775 780

His Arg Ser Lys Gln Arg His Lys Gln Ser Leu Tyr Gly Asp Tyr Val
785 790 795 800

Phe Asp Thr Asn Arg His Asp Asp Asn Arg Ser Asp Asn Phe Asn Thr
805 810 815

Gly Asn Met Thr Val Leu Ser Pro Tyr Leu Asn Thr Thr Val Leu Pro
820 825 830

Ser Ser Ser Ser Ser Arg Gly Ser Leu Asp Ser Ser Arg Ser Glu Lys
835 840 845

Asp Arg Ser Leu Glu Arg Glu Arg Gly Ile Gly Leu Gly Asn Tyr His
850 855 860

Pro Ala Thr Glu Asn Pro Gly Thr Ser Ser Lys Arg Gly Leu Gln Ile
865 870 875 880

Ser Thr Thr Ala Ala Gln Ile Ala Lys Val Met Glu Glu Val Ser Ala
885 890 895

Ile His Thr Ser Gln Glu Asp Arg Ser Ser Gly Ser Thr Thr Glu Leu
900 905 910

His Cys Val Thr Asp Glu Arg Asn Ala Leu Arg Arg Ser Ser Ala Ala
915 920 925



His Thr His Ser Asn Thr Tyr Asn Phe Thr Lys Ser Glu Asn Ser Asn 
930 935 940 

Arg Thr Cys Ser Met Pro Tyr Ala Lys Leu Glu Tyr Lys Arg Ser Ser 
945 950 955 960 

Asn Asp Ser Leu Asn Ser Val Ser Ser Asn Asp Gly Tyr Gly Lys Arg 
965 970 975 

Gly Gln Met Lys Pro Ser Ile Glu Ser Tyr Ser Glu Asp Asp Glu Ser
980 985 990 

Lys Phe Cys Ser Tyr Gly Gln Tyr Pro Ala Asp Leu Ala His Lys Ile
995 1000 1005 

His Ser Ala Asn His Met Asp Asp Asn Asp Gly Glu Leu Asp Thr Pro 
1010 1015 1020 

Ile Asn Tyr Ser Leu Lys Tyr Ser Asp Glu Gln Leu Asn Ser Gly Arg
1025 1030 1035 1040 

Gln Ser Pro Ser Gln Asn Glu Arg Trp Ala Arg Pro Lys His Ile Ile
1045 1050 1055 

Glu Asp Glu Ile Lys Gln Ser Glu Gln Arg Gln Ser Arg Asn Gln Ser
1060 1065 1070 

Thr Thr Tyr Pro Val Tyr Thr Glu Ser Thr Asp Asp Lys His Leu Lys 
1075 1080 1085 

Phe Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser
1090 1095 1100 

Arg Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly
1105 1110 1115 1120

Ile Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu
1125 1130 1135 

Asp Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln
1140 1145 1150 

His Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu
1155 1160 1165 

Glu Lys Arg His Val Asp Gln Pro Ile Asp Tyr Ser Leu Lys Tyr Ala
1170 1175 1180 

Thr Asp Ile Pro Ser Ser Gln Lys Gln Ser Phe Ser Phe Ser Lys Ser
1185 1190 1195 1200 

Ser Ser Gly Gln Ser Ser Lys Thr Glu His Met Ser Ser Ser Ser Glu
1205 1210 1215 

Asn Thr Ser Thr Pro Ser Ser Asn Ala Lys Arg Gin Asn Gin Leu His 
1220 1225 1230 

Pro Ser Ser Ala Gin Ser Arg Ser Gly Gin Pro Gin Lys Ala Ala Thr 
1235 1240 1245 

Cys Lys Val Ser Ser He Asn Gin Glu Thr He Gin Thr Tyr Cys Val 
1250 1255 1260 

Glu Asp Thr Pro He Cys Phe Ser Arg Cys Ser Ser Leu Ser Ser Leu 
1265 1270 1275 1280 



Ser Ser Ala Glu Asp Glu Ile Gly Cys Asn Gln Thr Thr Gln Glu Ala
1285 1290 1295

Asp Ser Ala Asn Thr Leu Gln Ile Ala Glu Ile Lys Gly Lys Ile Gly
1300 1305 1310

Thr Arg Ser Ala Glu Asp Pro Val Ser Glu Val Pro Ala Val Ser Gln
1315 1320 1325

His Pro Arg Thr Lys Ser Ser Arg Leu Gln Gly Ser Ser Leu Ser Ser
1330 1335 1340

Glu Ser Ala Arg His Lys Ala Val Glu Phe Pro Ser Gly Ala Lys Ser
1345 1350 1355 1360

Pro Ser Lys Ser Gly Ala Gln Thr Pro Lys Ser Pro Pro Glu His Tyr
1365 1370 1375

Val Gln Glu Thr Pro Leu Met Phe Ser Arg Cys Thr Ser Val Ser Ser
1380 1385 1390

Leu Asp Ser Phe Glu Ser Arg Ser Ile Ala Ser Ser Val Gln Ser Glu
1395 1400 1405



Pro Cys Ser Gly Met Val Ser Gly Ile Ile Ser Pro Ser Asp Leu Pro
1410 1415 1420

Asp Ser Pro Gly Gln Thr Met Pro Pro Ser Arg Ser Lys Thr Pro Pro
1425 1430 1435 1440

Pro Pro Pro Gln Thr Ala Gln Thr Lys Arg Glu Val Pro Lys Asn Lys
1445 1450 1455

Ala Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val
1460 1465 1470

Asn Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu
1475 1480 1485 

Leu His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser 
1490 1495 1500 

Ser Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val
1505 1510 1515 1520

Glu Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu
1525 1530 1535

Thr Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu
1540 1545 1550

Ala Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp
1555 1560 1565

Asp Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro
1570 1575 1580

Thr Lys Ser Ser Arg Lys Gly Lys Lys Pro Ala Gln Thr Ala Ser Lys
1585 1590 1595 1600

Leu Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys
1605 1610 1615

Leu Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe
1620 1625 1630 



Thr Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro 
1635 1640 1645 

Ile Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser
1650 1655 1660

Pro Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln
1665 1670 1675 1680

Ser Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser
1685 1690 1695

Thr Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu
1700 1705 1710

Leu Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile
1715 1720 1725 

Asn Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys 
1730 1735 1740 

Lys Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro
1745 1750 1755 1760

Asn Lys Asn Gln Leu Asp Gly Lys Lys Lys Lys Pro Thr Ser Pro Val
1765 1770 1775

Lys Pro Ile Pro Gln Asn Thr Glu Tyr Arg Thr Arg Val Arg Lys Asn
1780 1785 1790 

Ala Asp Ser Lys Asn Asn Leu Asn Ala Glu Arg Val Phe Ser Asp Asn 
1795 1800 1805 

Lys Asp Ser Lys Lys Gln Asn Leu Lys Asn Asn Ser Lys Asp Phe Asn
1810 1815 1820 

Asp Lys Leu Pro Asn Asn Glu Asp Arg Val Arg Gly Ser Phe Ala Phe 
1825 1830 1835 1840 

Asp Ser Pro His His Tyr Thr Pro Ile Glu Gly Thr Pro Tyr Cys Phe
1845 1850 1855 

Ser Arg Asn Asp Ser Leu Ser Ser Leu Asp Phe Asp Asp Asp Asp Val 
1860 1865 1870 

Asp Leu Ser Arg Glu Lys Ala Glu Leu Arg Lys Ala Lys Glu Asn Lys 
1875 1880 1885 

Glu Ser Glu Ala Lys Val Thr Ser His Thr Glu Leu Thr Ser Asn Gln
1890 1895 1900

Gln Ser Ala Asn Lys Thr Gln Ala Ile Ala Lys Gln Pro Ile Asn Arg
1905 1910 1915 1920

Gly Gln Pro Lys Pro Ile Leu Gln Lys Gln Ser Thr Phe Pro Gln Ser
1925 1930 1935

Ser Lys Asp Ile Pro Asp Arg Gly Ala Ala Thr Asp Glu Lys Leu Gln
1940 1945 1950

Asn Phe Ala Ile Glu Asn Thr Pro Val Cys Phe Ser His Asn Ser Ser
1955 1960 1965

Leu Ser Ser Leu Ser Asp Ile Asp Gln Glu Asn Asn Asn Lys Glu Asn
1970 1975 1980

Glu Pro Ile Lys Glu Thr Glu Pro Pro Asp Ser Gln Gly Glu Pro Ser
1985 1990 1995 2000

Lys Pro Gln Ala Ser Gly Tyr Ala Pro Lys Ser Phe His Val Glu Asp
2005 2010 2015

Thr Pro Val Cys Phe Ser Arg Asn Ser Ser Leu Ser Ser Leu Ser Ile
2020 2025 2030

Asp Ser Glu Asp Asp Leu Leu Gln Glu Cys Ile Ser Ser Ala Met Pro
2035 2040 2045

Lys Lys Lys Lys Pro Ser Arg Leu Lys Gly Asp Asn Glu Lys His Ser
2050 2055 2060

Pro Arg Asn Met Gly Gly Ile Leu Gly Glu Asp Leu Thr Leu Asp Leu
2065 2070 2075 2080

Lys Asp Ile Gln Arg Pro Asp Ser Glu His Gly Leu Ser Pro Asp Ser
2085 2090 2095

Glu Asn Phe Asp Trp Lys Ala Ile Gln Glu Gly Ala Asn Ser Ile Val
2100 2105 2110

Ser Ser Leu His Gln Ala Ala Ala Ala Ala Cys Leu Ser Arg Gln Ala
2115 2120 2125

Ser Ser Asp Ser Asp Ser Ile Leu Ser Leu Lys Ser Gly Ile Ser Leu
2130 2135 2140 



Gly Ser Pro Phe His Leu Thr Pro Asp Gln Glu Glu Lys Pro Phe Thr
2145 2150 2155 2160

Ser Asn Lys Gly Pro Arg Ile Leu Lys Pro Gly Glu Lys Ser Thr Leu
2165 2170 2175

Glu Thr Lys Lys Ile Glu Ser Glu Ser Lys Gly Ile Lys Gly Gly Lys
2180 2185 2190



Lys Val Tyr Lys Ser Leu Ile Thr Gly Lys Val Arg Ser Asn Ser Glu
2195 2200 2205

Ile Ser Gly Gln Met Lys Gln Pro Leu Gln Ala Asn Met Pro Ser Ile
2210 2215 2220

Ser Arg Gly Arg Thr Met Ile His Ile Pro Gly Val Arg Asn Ser Ser
2225 2230 2235 2240 

Ser Ser Thr Ser Pro Val Ser Lys Lys Gly Pro Pro Leu Lys Thr Pro 
2245 2250 2255 

Ala Ser Lys Ser Pro Ser Glu Gly Gln Thr Ala Thr Thr Ser Pro Arg
2260 2265 2270

Gly Ala Lys Pro Ser Val Lys Ser Glu Leu Ser Pro Val Ala Arg Gln
2275 2280 2285 



Thr Ser Gln Ile Gly Gly Ser Ser Lys Ala Pro Ser Arg Ser Gly Ser
2290 2295 2300

Arg Asp Ser Thr Pro Ser Arg Pro Ala Gln Gln Pro Leu Ser Arg Pro
2305 2310 2315 2320

Ile Gln Ser Pro Gly Arg Asn Ser Ile Ser Pro Gly Arg Asn Gly Ile
2325 2330 2335

Ser Pro Pro Asn Lys Leu Ser Gln Leu Pro Arg Thr Ser Ser Pro Ser
2340 2345 2350 

Thr Ala Ser Thr Lys Ser Ser Gly Ser Gly Lys Met Ser Tyr Thr Ser 
2355 2360 2365 

Pro Gly Arg Gln Met Ser Gln Gln Asn Leu Thr Lys Gln Thr Gly Leu
2370 2375 2380

Ser Lys Asn Ala Ser Ser Ile Pro Arg Ser Glu Ser Ala Ser Lys Gly
2385 2390 2395 2400

Leu Asn Gln Met Asn Asn Gly Asn Gly Ala Asn Lys Lys Val Glu Leu
2405 2410 2415 

Ser Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser 
2420 2425 2430 

Glu Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro
2435 2440 2445 

Ser Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser 
2450 2455 2460 

Leu Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln
2465 2470 2475 2480 

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His 
2485 2490 2495 

Ser Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser
2500 2505 2510

Pro Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile
2515 2520 2525

Ala Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser
2530 2535 2540 

Gly Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg
2545 2550 2555 2560

Val Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala
2565 2570 2575 



Ser Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val 
2580 2585 2590 

Asn Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala
2595 2600 2605

Lys Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn
2610 2615 2620

Ser Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser
2625 2630 2635 2640

Lys Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp
2645 2650 2655

Val Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly
2660 2665 2670

Arg Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu
2675 2680 2685

Lys Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln
2690 2695 2700 

Asn Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn 
2705 2710 2715 2720 

Arg Leu Thr Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr
2725 2730 2735

Glu Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn
2740 2745 2750

Glu Ser Pro Ile Val Glu Arg Thr Pro Phe Ser Ser Ser Ser Ser Ser
2755 2760 2765 

Lys His Ser Ser Pro Ser Gly Thr Val Ala Ala Arg Val Thr Pro Phe 
2770 2775 2780 

Asn Tyr Asn Pro Ser Pro Arg Lys Ser Ser Ala Asp Ser Thr Ser Ala 
2785 2790 2795 2800 

Arg Pro Ser Gln Ile Pro Thr Pro Val Asn Asn Asn Thr Lys Lys Arg
2805 2810 2815

Asp Ser Lys Thr Asp Ser Thr Glu Ser Ser Gly Thr Gln Ser Pro Lys
2820 2825 2830 

Arg His Ser Gly Ser Tyr Leu Val Thr Ser Val
2835 2840

(2) INFORMATION FOR SEQ ID NO: 3:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3172 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: DP1(TB2) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..630

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GCA GTC GCC GCT CCA GTC TAT CCG GCA CTA GGA ACA GCC CCG GGN GGC 48
Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

GAG ACG GTC CCC GCC ATG TCT GCG GCC ATG AGG GAG AGG TTC GAC CGG 96
Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

TTC CTG CAC GAG AAG AAC TGC ATG ACT GAC CTT CTG GCC AAG CTC GAG 144
Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

GCC AAA ACC GGC GTG AAC AGG AGC TTC ATC GCT CTT GGT GTC ATC GGA 192
Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

CTG GTG GCC TTG TAC CTG GTG TTC GGT TAT GGA GCC TCT CTC CTC TGC 240
Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

AAC CTG ATA GGA TTT GGC TAC CCA GCC TAC ATC TCA ATT AAA GCT ATA 288
Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

GAG AGT CCC AAC AAA GAA GAT GAT ACC CAG TGG CTG ACC TAC TGG GTA 336
Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

GTG TAT GGT GTG TTC AGC ATT GCT GAA TTC TTC TCT GAT ATC TTC CTG 384
Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125 

TCA TGG TTC CCC TTC TAC TAC ATG CTG AAG TGT GGC TTC CTG TTG TGG 432
Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

TGC ATG GCC CCG AGC CCT TCT AAT GGG GCT GAA CTG CTC TAC AAG CGC 480
Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

ATC ATC CGT CCT TTC TTC CTG AAG CAC GAG TCC CAG ATG GAC AGT GTG 528
Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

GTC AAG GAC CTT AAA GAC AAG TCC AAA GAG ACT GCA GAT GCC ATC ACT 576
Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190 

AAA GAA GCG AAG AAA GCT ACC GTG AAT TTA CTG GGT GAA GAA AAG AAG 624 
Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys 
195 200 205 

AGC ACC TAAACCAGAC TAAACCAGAC TGGATGGAAA CTTCCTGCCC TCTCTGTACC 680 
Ser Thr 
210 

TTCCTACTGG AGCTTGATGT TATATTAGGG ACTGTGGTAT AATTATTTTA ATAATGTTGC 740

CTTGGAAACA TTTTTGAGAT ATTAAAGATT GGAATGTGTT GTAAGTTTCT TTGCTTACTT 800 

TTACTGTCTA TATATATAGG GAGCACTTTA AACTTAATGC AGTGGGCAGT GTCCACGTTT 860 

TTGGAAAATG TATTTTGCCT CTGGGTAGGA AAAGATGTAT GTTGCTATCC TGCAGGAAAT 920

ATAAACTTAA AATAAAATTA TATACCCCAC AGGCTGTGTA CTTTACTGGG CTCTCCCTGC 980 

ACGSATTTTC TCTGTAGTTA CATTTAGGRT AATCTTTATG GTTCTACTTC CTRTAATGTA 1040 

CAATTTTATA TAATTCNGRA ATGTTTTTAA TGTATTTGTG CACATGTACA TATGGAAATG 1100 

TTACTGTCTG ACTACANCAT GCATCATGCT CATGGGGAGG GAGCAGGGGA AGGTTGTATG 1160 

TGTCATTTAT AACTTCTGTA CAGTAAGACC ACCTGCCAAA AGCTGGAGGA ACCATTGTGC 1220

TGGTGTGGTC TACTAAATAA TACTTTAGGA AATACGTGAT TAATATGCAA GTGAACAAAG 1280

TGAGAAATGA AATCGAATGG AGATTGGCCT GGTTGTTTCC GTAGTATATG GCATATGAAT 1340

ACCAGGATAG CTTTATAAAG CAGTTAGTTA GTTAGTTACT CACTCTAGTG ATAAATCGGG 1400

AAATTTACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAG 1460

AGTACCCTGT AACTCTCAAT TCCCTGAAAA ACTAGTAATA CTGTCTTATC TGCTATAAAC 1520

TTTACATATT TGTCTATTGT CAAGATGCTA CANTGGAMNC CATTTCTGGT TTTATCTTCA 1580 

NAGSGGAGAN ACATGTTGAT TTAGTCTTCT TTCCCAATCT TCTTTTTTAA MCCAGTTTNA 1640

GGMNCTTCTG RAGATTTGYC CACCTCTGAT TACATGTATG TTCTYGTTTG TATCATKAGC 1700

AACAACATGC TAATGRCGAC ACCTAGCTCT RAGMGCAATT CTGGGAGANT GARAGGNWGT 1760 

ATARAGTMNC CCATAATCTG CTTGGCAATA GTTAAGTCAA TCTATCTTCA GTTTTTCTCT 1820 

GGCCTTTAAG GTCAAACACA AGAGGCTTCC CTAGTTTACA AGTCAGAGTC ACTTGTAGTC 1880 

CATTTAAATG CCCTCATCCG TATTCTTTGT GTTGATAAGC TGCACAKGAC TACATAGTAA 1940

GTACAGANCA GTAAAGTTAA NNCGGATGTC TCCATTGATC TGCCAANTCG NTATAGAGAG 2000

CAATTTGTCT GGACTAGAAA ATCTGAGTTT TACACCATAC TGTTAAGAGT CCTTTTGAAT 2060

TAAACTAGAC TAAAACAAGT GTATAACTAA ACTAACAAGA TTAAATATCC AGCCAGTACA 2120

GTATTTTTTA AGGCAAATAA AGATGATTAG CTCACCTTGA GNTAACAATC AGGTAAGATC 2180 

ATNACAATGT CTCATGATGT NAANAATATT AAAGATATCA ATACTAAGTG ACAGTATCAC 2240 

NNCTAATATA ATATGGATCA GAGCATTTAT TTTGGGGAGG AAAACAGTGG TGATTACCGG 2300

CATTTTATTA AACTTAAAAC TTTGTAGAAA GCAAACAAAA TTGTTCTTGG GAGAAAATCA 2360

ACTTTTAGAT TAAAAAAATT TTAAGTAWCT AGGAGTATTT AAATCCTTTT CCCATAAATA 2420

AAAGTACAGT TTTCTTGGTG GCAGAATGAA AATCAGCAAC NTCTAGCATA TAGACTATAT 2480

AATCAGATTG ACAGCATATA GAATATATTA TCAGACAAGA TGAGGAGGTA CAAAAGTTAC 2540

TATTGCTCAT AATGACTTAC AGGCTAAAAN TAGNTNTAAA ATACTATATT AAATTCTGAA 2600

TGCAATTTTT TTTTGTTCCC TTGAGACCAA AATTTAAGTT AACTGTTGCT GGCAGTCTAA 2660

GTGTAAATGT TAACAGCAGG AGAAGTTAAG AATTGAGCAG TTCTGTTGCA TGATTTCCCA 2720

AATGAAATAC TGCCTTGGCT AGAGTTTGAA AAACTAATTG AGCCTGTGCC TGGCTAGAAA 2780

ACAAGCGTTT ATTTGAATGT GAATAGTGTT TCAAAGGTAT GTAGTTACAG AATTCCTACC 2840

AAACAGCTTA AATTCTTCAA GAAAGAATTC CTGCAGCAGT TATTCCCTTA CCTGAAGGCT 2900

TCAATCATTT GGATCAACAA CTGCTACTCT CGGGAAGACT CCTCTACTCA CAGCTGAAGA 2960

AAATGAGCAC ACCCTTCACA CTGTTATCAC CTATCCTGAA GATGTGATAC ACTGAATGGA 3020

AATAAATAGA TGTAAATAAA ATTGAGWTCT CATTTAAAAA AAACCATGTG CCCAATGGGA 3080 

AAATGACCTC ATGTTGTGGT TTAAACAGCA ACTGCACCCA CTAGCACAGC CCATTGAGCT 3140 

ANCCTATATA TACATCTCTG TCAGTGCCCC TC 3172 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 210 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly 
1 5 10 15

Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg 
20 25 30 

Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu 
35 40 45 

Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190 

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys 
195 200 205 

Ser Thr 
210 

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 434 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: TB1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Val Ala Pro Val Val Val Gly Ser Gly Arg Ala Pro Arg His Pro Ala 
1 5 10 15

Pro Ala Ala Met His Pro Arg Arg Pro Asp Gly Phe Asp Gly Leu Gly
20 25 30

Tyr Arg Gly Gly Ala Arg Asp Glu Gln Gly Phe Gly Gly Ala Phe Pro
35 40 45

Ala Arg Ser Phe Ser Thr Gly Ser Asp Leu Gly His Trp Val Thr Thr
50 55 60

Pro Pro Asp Ile Pro Gly Ser Arg Asn Leu His Trp Gly Glu Lys Ser
65 70 75 80

Pro Pro Tyr Gly Val Pro Thr Thr Ser Thr Pro Tyr Glu Gly Pro Thr
85 90 95

Glu Glu Pro Phe Ser Ser Gly Gly Gly Gly Ser Val Gln Gly Gln Ser
100 105 110

Ser Glu Gln Leu Asn Arg Phe Ala Gly Phe Gly Ile Gly Leu Ala Ser
115 120 125

Leu Phe Thr Glu Asn Val Leu Ala His Pro Cys Ile Val Leu Arg Arg
130 135 140

Gln Cys Gln Val Asn Tyr His Ala Gln His Tyr His Leu Thr Pro Phe
145 150 155 160

Thr Val Ile Asn Ile Met Tyr Ser Phe Asn Lys Thr Gln Gly Pro Arg
165 170 175

Ala Leu Trp Lys Gly Met Gly Ser Thr Phe Ile Val Gln Gly Val Thr
180 185 190 

Leu Gly Ala Glu Gly Ile Ile Ser Glu Phe Thr Pro Leu Pro Arg Glu
195 200 205

Val Leu His Lys Trp Ser Pro Lys Gln Ile Gly Glu His Leu Leu Leu
210 215 220

Lys Ser Leu Thr Tyr Val Val Ala Met Pro Phe Tyr Ser Ala Ser Leu
225 230 235 240

Ile Glu Thr Val Gln Ser Glu Ile Ile Arg Asp Asn Thr Gly Ile Leu
245 250 255

Glu Cys Val Lys Glu Gly Ile Gly Arg Val Ile Gly Met Gly Val Pro
260 265 270

His Ser Lys Arg Leu Leu Pro Leu Leu Ser Leu Ile Phe Pro Thr Val
275 280 285

Leu His Gly Val Leu His Tyr Ile Ile Ser Ser Val Ile Gln Lys Phe
290 295 300

Val Leu Leu Ile Leu Lys Arg Lys Thr Tyr Asn Ser His Leu Ala Glu
305 310 315 320

Ser Thr Ser Pro Val Gln Ser Met Leu Asp Ala Tyr Phe Pro Glu Leu
325 330 335

Ile Ala Asn Phe Ala Ala Ser Leu Cys Ser Asp Val Ile Leu Tyr Pro
340 345 350

Leu Glu Thr Val Leu His Arg Leu His Ile Gln Gly Thr Arg Thr Ile
355 360 365

Ile Asp Asn Thr Asp Leu Gly Tyr Glu Val Leu Pro Ile Asn Thr Gln
370 375 380

Tyr Glu Gly Met Arg Asp Cys Ile Asn Thr Ile Arg Gln Glu Glu Gly
385 390 395 400

Val Phe Gly Phe Tyr Lys Gly Phe Gly Ala Val Ile Ile Gln Tyr Thr
405 410 415

Leu His Ala Ala Val Leu Gln Ile Thr Lys Ile Ile Tyr Ser Thr Leu
420 425 430

Leu Gln



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 185 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: YS-39(TB2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Glu Leu Arg Arg Phe Asp Arg Phe Leu His Glu Lys Asn Cys Met Thr 
1 5 10 15

Asp Leu Leu Ala Lys Leu Glu Ala Lys Thr Gly Val Asn Arg Ser Phe
20 25 30

Ile Ala Leu Gly Val Ile Gly Leu Val Ala Leu Tyr Leu Val Phe Gly
35 40 45

Tyr Gly Ala Ser Leu Leu Cys Asn Leu Ile Gly Phe Gly Tyr Pro Ala
50 55 60

Tyr Ile Ser Ile Lys Ala Ile Glu Ser Pro Asn Lys Glu Asp Asp Thr
65 70 75 80

Gln Trp Leu Thr Tyr Trp Val Val Tyr Gly Val Phe Ser Ile Ala Glu
85 90 95

Phe Phe Ser Asp Ile Phe Leu Ser Trp Phe Pro Phe Tyr Tyr Ile Leu
100 105 110

Lys Cys Gly Phe Leu Leu Trp Cys Met Ala Pro Ser Pro Ser Asn Gly
115 120 125

Ala Glu Leu Leu Tyr Lys Arg Ile Ile Arg Pro Phe Phe Leu Lys His
130 135 140

Glu Ser Gln Met Asp Ser Val Val Lys Asp Leu Lys Asp Lys Ala Lys
145 150 155 160

Glu Thr Ala Asp Ala Ile Thr Lys Glu Ala Lys Lys Ala Thr Val Asn
165 170 175 

Leu Leu Gly Glu Glu Lys Lys Ser Thr 
180 185 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2843 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: APC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu
1 5 10 15

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn
20 25 30

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu
35 40 45

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly
50 55 60

Gln Ile Asp Leu Leu Glu Arg Leu Lys Glu Leu Asn Leu Asp Ser Ser
65 70 75 80

Asn Phe Pro Gly Val Lys Leu Arg Ser Lys Met Ser Leu Arg Ser Tyr
85 90 95

Gly Ser Arg Glu Gly Ser Val Ser Ser Arg Ser Gly Glu Cys Ser Pro
100 105 110

Val Pro Met Gly Ser Phe Pro Arg Arg Gly Phe Val Asn Gly Ser Arg
115 120 125

Glu Ser Thr Gly Tyr Leu Glu Glu Leu Glu Lys Glu Arg Ser Leu Leu
130 135 140

Leu Ala Asp Leu Asp Lys Glu Glu Lys Glu Lys Asp Trp Tyr Tyr Ala
145 150 155 160

Gln Leu Gln Asn Leu Thr Lys Arg Ile Asp Ser Leu Pro Leu Thr Glu
165 170 175

Asn Phe Ser Leu Gln Thr Asp Met Thr Arg Arg Gln Leu Glu Tyr Glu
180 185 190

Ala Arg Gln Ile Arg Val Ala Met Glu Glu Gln Leu Gly Thr Cys Gln
195 200 205

Asp Met Glu Lys Arg Ala Gln Arg Arg Ile Ala Arg Ile Gln Gln Ile
210 215 220 

Glu Lys Asp Ile Leu Arg Ile Arg Gln Leu Leu Gln Ser Gln Ala Thr
225 230 235 240

Glu Ala Glu Arg Ser Ser Gln Asn Lys His Glu Thr Gly Ser His Asp
245 250 255

Ala Glu Arg Gln Asn Glu Gly Gln Gly Val Gly Glu Ile Asn Met Ala
260 265 270

Thr Ser Gly Asn Gly Gln Gly Ser Thr Thr Arg Met Asp His Glu Thr
275 280 285

Ala Ser Val Leu Ser Ser Ser Ser Thr His Ser Ala Pro Arg Arg Leu
290 295 300

Thr Ser His Leu Gly Thr Lys Val Glu Met Val Tyr Ser Leu Leu Ser
305 310 315 320

Met Leu Gly Thr His Asp Lys Asp Asp Met Ser Arg Thr Leu Leu Ala
325 330 335

Met Ser Ser Ser Gln Asp Ser Cys Ile Ser Met Arg Gln Ser Gly Cys
340 345 350

Leu Pro Leu Leu Ile Gln Leu Leu His Gly Asn Asp Lys Asp Ser Val
355 360 365

Leu Leu Gly Asn Ser Arg Gly Ser Lys Glu Ala Arg Ala Arg Ala Ser
370 375 380

Ala Ala Leu His Asn Ile Ile His Ser Gln Pro Asp Asp Lys Arg Gly
385 390 395 400



Arg Arg Glu Ile Arg Val Leu His Leu Leu Glu Gln Ile Arg Ala Tyr
405 410 415

Cys Glu Thr Cys Trp Glu Trp Gln Glu Ala His Glu Pro Gly Met Asp
420 425 430

Gln Asp Lys Asn Pro Met Pro Ala Pro Val Glu His Gln Ile Cys Pro
435 440 445

Ala Val Cys Val Leu Met Lys Leu Ser Phe Asp Glu Glu His Arg His
450 455 460

Ala Met Asn Glu Leu Gly Gly Leu Gln Ala Ile Ala Glu Leu Leu Gln
465 470 475 480

Val Asp Cys Glu Met Tyr Gly Leu Thr Asn Asp His Tyr Ser Ile Thr
485 490 495

Leu Arg Arg Tyr Ala Gly Met Ala Leu Thr Asn Leu Thr Phe Gly Asp
500 505 510



Val Ala Asn Lys Ala Thr Leu Cys Ser Met Lys Gly Cys Met Arg Ala 
515 520 525 



Leu Val Ala Gln Leu Lys Ser Glu Ser Glu Asp Leu Gln Gln Val Ile
530 535 540

Ala Ser Val Leu Arg Asn Leu Ser Trp Arg Ala Asp Val Asn Ser Lys
545 550 555 560

Lys Thr Leu Arg Glu Val Gly Ser Val Lys Ala Leu Met Glu Cys Ala
565 570 575

Leu Glu Val Lys Lys Glu Ser Thr Leu Lys Ser Val Leu Ser Ala Leu
580 585 590

Trp Asn Leu Ser Ala His Cys Thr Glu Asn Lys Ala Asp Ile Cys Ala
595 600 605



Val Asp Gly Ala Leu Ala Phe Leu Val Gly Thr Leu Thr Tyr Arg Ser 
610 615 620 

Gln Thr Asn Thr Leu Ala Ile Ile Glu Ser Gly Gly Gly Ile Leu Arg
625 630 635 640

Asn Val Ser Ser Leu Ile Ala Thr Asn Glu Asp His Arg Gln Ile Leu
645 650 655

Arg Glu Asn Asn Cys Leu Gln Thr Leu Leu Gln His Leu Lys Ser His
660 665 670

Ser Leu Thr Ile Val Ser Asn Ala Cys Gly Thr Leu Trp Asn Leu Ser
675 680 685

Ala Arg Asn Pro Lys Asp Gln Glu Ala Leu Trp Asp Met Gly Ala Val
690 695 700

Ser Met Leu Lys Asn Leu Ile His Ser Lys His Lys Met Ile Ala Met
705 710 715 720 

Gly Ser Ala Ala Ala Leu Arg Asn Leu Met Ala Asn Arg Pro Ala Lys 
725 730 735 

Tyr Lys Asp Ala Asn Ile Met Ser Pro Gly Ser Ser Leu Pro Ser Leu
740 745 750

His Val Arg Lys Gln Lys Ala Leu Glu Ala Glu Leu Asp Ala Gln His
755 760 765

Leu Ser Glu Thr Phe Asp Asn Ile Asp Asn Leu Ser Pro Lys Ala Ser
770 775 780

His Arg Ser Lys Gln Arg His Lys Gln Ser Leu Tyr Gly Asp Tyr Val
785 790 795 800 

Phe Asp Thr Asn Arg His Asp Asp Asn Arg Ser Asp Asn Phe Asn Thr 
805 810 815 

Gly Asn Met Thr Val Leu Ser Pro Tyr Leu Asn Thr Thr Val Leu Pro
820 825 830



Ser Ser Ser Ser Ser Arg Gly Ser Leu Asp Ser Ser Arg Ser Glu Lys
835 840 845

Asp Arg Ser Leu Glu Arg Glu Arg Gly Ile Gly Leu Gly Asn Tyr His
850 855 860

Pro Ala Thr Glu Asn Pro Gly Thr Ser Ser Lys Arg Gly Leu Gln Ile
865 870 875 880

Ser Thr Thr Ala Ala Gln Ile Ala Lys Val Met Glu Glu Val Ser Ala
885 890 895

Ile His Thr Ser Gln Glu Asp Arg Ser Ser Gly Ser Thr Thr Glu Leu
900 905 910

His Cys Val Thr Asp Glu Arg Asn Ala Leu Arg Arg Ser Ser Ala Ala
915 920 925

His Thr His Ser Asn Thr Tyr Asn Phe Thr Lys Ser Glu Asn Ser Asn
930 935 940

Arg Thr Cys Ser Met Pro Tyr Ala Lys Leu Glu Tyr Lys Arg Ser Ser
945 950 955 960



Asn Asp Ser Leu Asn Ser Val Ser Ser Ser Asp Gly Tyr Gly Lys Arg
965 970 975

Gly Gln Met Lys Pro Ser Ile Glu Ser Tyr Ser Glu Asp Asp Glu Ser
980 985 990

Lys Phe Cys Ser Tyr Gly Gln Tyr Pro Ala Asp Leu Ala His Lys Ile
995 1000 1005

His Ser Ala Asn His Met Asp Asp Asn Asp Gly Glu Leu Asp Thr Pro
1010 1015 1020

Ile Asn Tyr Ser Leu Lys Tyr Ser Asp Glu Gln Leu Asn Ser Gly Arg
1025 1030 1035 1040

Gln Ser Pro Ser Gln Asn Glu Arg Trp Ala Arg Pro Lys His Ile Ile
1045 1050 1055

Glu Asp Glu Ile Lys Gln Ser Glu Gln Arg Gln Ser Arg Asn Gln Ser
1060 1065 1070

Thr Thr Tyr Pro Val Tyr Thr Glu Ser Thr Asp Asp Lys His Leu Lys
1075 1080 1085

Phe Gln Pro His Phe Gly Gln Gln Glu Cys Val Ser Pro Tyr Arg Ser
1090 1095 1100

Arg Gly Ala Asn Gly Ser Glu Thr Asn Arg Val Gly Ser Asn His Gly
1105 1110 1115 1120



Ile Asn Gln Asn Val Ser Gln Ser Leu Cys Gln Glu Asp Asp Tyr Glu
1125 1130 1135

Asp Asp Lys Pro Thr Asn Tyr Ser Glu Arg Tyr Ser Glu Glu Glu Gln
1140 1145 1150

His Glu Glu Glu Glu Arg Pro Thr Asn Tyr Ser Ile Lys Tyr Asn Glu
1155 1160 1165

Glu Lys Arg His Val Asp Gln Pro Ile Asp Tyr Ser Leu Lys Tyr Ala
1170 1175 1180

Thr Asp Ile Pro Ser Ser Gln Lys Gln Ser Phe Ser Phe Ser Lys Ser
1185 1190 1195 1200

Ser Ser Gly Gln Ser Ser Lys Thr Glu His Met Ser Ser Ser Ser Glu
1205 1210 1215

Asn Thr Ser Thr Pro Ser Ser Asn Ala Lys Arg Gln Asn Gln Leu His
1220 1225 1230

Pro Ser Ser Ala Gln Ser Arg Ser Gly Gln Pro Gln Lys Ala Ala Thr
1235 1240 1245

Cys Lys Val Ser Ser Ile Asn Gln Glu Thr Ile Gln Thr Tyr Cys Val
1250 1255 1260

Glu Asp Thr Pro Ile Cys Phe Ser Arg Cys Ser Ser Leu Ser Ser Leu
1265 1270 1275 1280

Ser Ser Ala Glu Asp Glu Ile Gly Cys Asn Gln Thr Thr Gln Glu Ala
1285 1290 1295

Asp Ser Ala Asn Thr Leu Gln Ile Ala Glu Ile Lys Glu Lys Ile Gly
1300 1305 1310 



Thr Arg Ser Ala Glu Asp Pro Val Ser Glu Val Pro Ala Val Ser Gln
1315 1320 1325

His Pro Arg Thr Lys Ser Ser Arg Leu Gln Gly Ser Ser Leu Ser Ser
1330 1335 1340

Glu Ser Ala Arg His Lys Ala Val Glu Phe Ser Ser Gly Ala Lys Ser
1345 1350 1355 1360

Pro Ser Lys Ser Gly Ala Gln Thr Pro Lys Ser Pro Pro Glu His Tyr
1365 1370 1375

Val Gln Glu Thr Pro Leu Met Phe Ser Arg Cys Thr Ser Val Ser Ser
1380 1385 1390

Leu Asp Ser Phe Glu Ser Arg Ser Ile Ala Ser Ser Val Gln Ser Glu
1395 1400 1405



Pro Cys Ser Gly Met Val Ser Gly Ile Ile Ser Pro Ser Asp Leu Pro
1410 1415 1420

Asp Ser Pro Gly Gln Thr Met Pro Pro Ser Arg Ser Lys Thr Pro Pro
1425 1430 1435 1440

Pro Pro Pro Gln Thr Ala Gln Thr Lys Arg Glu Val Pro Lys Asn Lys
1445 1450 1455

Ala Pro Thr Ala Glu Lys Arg Glu Ser Gly Pro Lys Gln Ala Ala Val
1460 1465 1470

Asn Ala Ala Val Gln Arg Val Gln Val Leu Pro Asp Ala Asp Thr Leu
1475 1480 1485 

Leu His Phe Ala Thr Glu Ser Thr Pro Asp Gly Phe Ser Cys Ser Ser 
1490 1495 1500 

Ser Leu Ser Ala Leu Ser Leu Asp Glu Pro Phe Ile Gln Lys Asp Val
1505 1510 1515 1520

Glu Leu Arg Ile Met Pro Pro Val Gln Glu Asn Asp Asn Gly Asn Glu
1525 1530 1535

Thr Glu Ser Glu Gln Pro Lys Glu Ser Asn Glu Asn Gln Glu Lys Glu
1540 1545 1550

Ala Glu Lys Thr Ile Asp Ser Glu Lys Asp Leu Leu Asp Asp Ser Asp
1555 1560 1565

Asp Asp Asp Ile Glu Ile Leu Glu Glu Cys Ile Ile Ser Ala Met Pro
1570 1575 1580

Thr Lys Ser Ser Arg Lys Ala Lys Lys Pro Ala Gln Thr Ala Ser Lys
1585 1590 1595 1600

Leu Pro Pro Pro Val Ala Arg Lys Pro Ser Gln Leu Pro Val Tyr Lys
1605 1610 1615

Leu Leu Pro Ser Gln Asn Arg Leu Gln Pro Gln Lys His Val Ser Phe
1620 1625 1630 

Thr Pro Gly Asp Asp Met Pro Arg Val Tyr Cys Val Glu Gly Thr Pro 
1635 1640 1645 

Ile Asn Phe Ser Thr Ala Thr Ser Leu Ser Asp Leu Thr Ile Glu Ser
1650 1655 1660

Pro Pro Asn Glu Leu Ala Ala Gly Glu Gly Val Arg Gly Gly Ala Gln
1665 1670 1675 1680

Ser Gly Glu Phe Glu Lys Arg Asp Thr Ile Pro Thr Glu Gly Arg Ser
1685 1690 1695

Thr Asp Glu Ala Gln Gly Gly Lys Thr Ser Ser Val Thr Ile Pro Glu
1700 1705 1710

Leu Asp Asp Asn Lys Ala Glu Glu Gly Asp Ile Leu Ala Glu Cys Ile
1715 1720 1725 

Asn Ser Ala Met Pro Lys Gly Lys Ser His Lys Pro Phe Arg Val Lys 
1730 1735 1740 

Lys Ile Met Asp Gln Val Gln Gln Ala Ser Ala Ser Ser Ser Ala Pro
1745 1750 1755 1760

Asn Lys Asn Gln Leu Asp Gly Lys Lys Lys Lys Pro Thr Ser Pro Val
1765 1770 1775

Lys Pro Ile Pro Gln Asn Thr Glu Tyr Arg Thr Arg Val Arg Lys Asn
1780 1785 1790 

Ala Asp Ser Lys Asn Asn Leu Asn Ala Glu Arg Val Phe Ser Asp Asn 
1795 1800 1805 

Lys Asp Ser Lys Lys Gln Asn Leu Lys Asn Asn Ser Lys Asp Phe Asn
1810 1815 1820 

Asp Lys Leu Pro Asn Asn Glu Asp Arg Val Arg Gly Ser Phe Ala Phe 
1825 1830 1835 1840 

Asp Ser Pro His His Tyr Thr Pro Ile Glu Gly Thr Pro Tyr Cys Phe
1845 1850 1855 

Ser Arg Asn Asp Ser Leu Ser Ser Leu Asp Phe Asp Asp Asp Asp Val 
1860 1865 1870 

Asp Leu Ser Arg Glu Lys Ala Glu Leu Arg Lys Ala Lys Glu Asn Lys 
1875 1880 1885 

Glu Ser Glu Ala Lys Val Thr Ser His Thr Glu Leu Thr Ser Asn Gln
1890 1895 1900

Gln Ser Ala Asn Lys Thr Gln Ala Ile Ala Lys Gln Pro Ile Asn Arg
1905 1910 1915 1920

Gly Gln Pro Lys Pro Ile Leu Gln Lys Gln Ser Thr Phe Pro Gln Ser
1925 1930 1935

Ser Lys Asp Ile Pro Asp Arg Gly Ala Ala Thr Asp Glu Lys Leu Gln
1940 1945 1950

Asn Phe Ala Ile Glu Asn Thr Pro Val Cys Phe Ser His Asn Ser Ser
1955 1960 1965 

Leu Ser Ser Leu Ser Asp Ile Asp Gln Glu Asn Asn Asn Lys Glu Asn
1970 1975 1980

Glu Pro Ile Lys Glu Thr Glu Pro Pro Asp Ser Gln Gly Glu Pro Ser
1985 1990 1995 2000

Lys Pro Gln Ala Ser Gly Tyr Ala Pro Lys Ser Phe His Val Glu Asp
2005 2010 2015



Thr Pro Val Cys Phe Ser Arg Asn Ser Ser Leu Ser Ser Leu Ser Ile
2020 2025 2030

Asp Ser Glu Asp Asp Leu Leu Gln Glu Cys Ile Ser Ser Ala Met Pro
2035 2040 2045 

Lys Lys Lys Lys Pro Ser Arg Leu Lys Gly Asp Asn Glu Lys His Ser 
2050 2055 2060 

Pro Arg Asn Met Gly Gly Ile Leu Gly Glu Asp Leu Thr Leu Asp Leu
2065 2070 2075 2080

Lys Asp Ile Gln Arg Pro Asp Ser Glu His Gly Leu Ser Pro Asp Ser
2085 2090 2095

Glu Asn Phe Asp Trp Lys Ala Ile Gln Glu Gly Ala Asn Ser Ile Val
2100 2105 2110

Ser Ser Leu His Gln Ala Ala Ala Ala Ala Cys Leu Ser Arg Gln Ala
2115 2120 2125

Ser Ser Asp Ser Asp Ser Ile Leu Ser Leu Lys Ser Gly Ile Ser Leu
2130 2135 2140

Gly Ser Pro Phe His Leu Thr Pro Asp Gln Glu Glu Lys Pro Phe Thr
2145 2150 2155 2160

Ser Asn Lys Gly Pro Arg Ile Leu Lys Pro Gly Glu Lys Ser Thr Leu
2165 2170 2175

Glu Thr Lys Lys Ile Glu Ser Glu Ser Lys Gly Ile Lys Gly Gly Lys
2180 2185 2190

Lys Val Tyr Lys Ser Leu Ile Thr Gly Lys Val Arg Ser Asn Ser Glu
2195 2200 2205

Ile Ser Gly Gln Met Lys Gln Pro Leu Gln Ala Asn Met Pro Ser Ile
2210 2215 2220

Ser Arg Gly Arg Thr Met Ile His Ile Pro Gly Val Arg Asn Ser Ser
2225 2230 2235 2240 

Ser Ser Thr Ser Pro Val Ser Lys Lys Gly Pro Pro Leu Lys Thr Pro 
2245 2250 2255 

Ala Ser Lys Ser Pro Ser Glu Gly Gin Thr Ala Thr Thr Ser Pro Arg 
2260 2265 2270

Gly Ala Lys Pro Ser Val Lys Ser Glu Leu Ser Pro Val Ala Arg Gln
2275 2280 2285

Thr Ser Gln Ile Gly Gly Ser Ser Lys Ala Pro Ser Arg Ser Gly Ser
2290 2295 2300

Arg Asp Ser Thr Pro Ser Arg Pro Ala Gln Gln Pro Leu Ser Arg Pro
2305 2310 2315 2320

Ile Gln Ser Pro Gly Arg Asn Ser Ile Ser Pro Gly Arg Asn Gly Ile
2325 2330 2335

Ser Pro Pro Asn Lys Leu Ser Gln Leu Pro Arg Thr Ser Ser Pro Ser
2340 2345 2350 

Thr Ala Ser Thr Lys Ser Ser Gly Ser Gly Lys Met Ser Tyr Thr Ser 
2355 2360 2365 



Pro Gly Arg Gln Met Ser Gln Gln Asn Leu Thr Lys Gln Thr Gly Leu
2370 2375 2380

Ser Lys Asn Ala Ser Ser Ile Pro Arg Ser Glu Ser Ala Ser Lys Gly
2385 2390 2395 2400

Leu Asn Gln Met Asn Asn Gly Asn Gly Ala Asn Lys Lys Val Glu Leu
2405 2410 2415

Ser Arg Met Ser Ser Thr Lys Ser Ser Gly Ser Glu Ser Asp Arg Ser
2420 2425 2430

Glu Arg Pro Val Leu Val Arg Gln Ser Thr Phe Ile Lys Glu Ala Pro
2435 2440 2445

Ser Pro Thr Leu Arg Arg Lys Leu Glu Glu Ser Ala Ser Phe Glu Ser
2450 2455 2460

Leu Ser Pro Ser Ser Arg Pro Ala Ser Pro Thr Arg Ser Gln Ala Gln
2465 2470 2475 2480

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Met Ser Leu Ser Thr His
2485 2490 2495

Ser Ser Val Gln Ala Gly Gly Trp Arg Lys Leu Pro Pro Asn Leu Ser
2500 2505 2510

Pro Thr Ile Glu Tyr Asn Asp Gly Arg Pro Ala Lys Arg His Asp Ile
2515 2520 2525

Ala Arg Ser His Ser Glu Ser Pro Ser Arg Leu Pro Ile Asn Arg Ser
2530 2535 2540 

Gly Thr Trp Lys Arg Glu His Ser Lys His Ser Ser Ser Leu Pro Arg 
2545 2550 2555 2560

Val Ser Thr Trp Arg Arg Thr Gly Ser Ser Ser Ser Ile Leu Ser Ala
2565 2570 2575



Ser Ser Glu Ser Ser Glu Lys Ala Lys Ser Glu Asp Glu Lys His Val 
2580 2585 2590 

Asn Ser Ile Ser Gly Thr Lys Gln Ser Lys Glu Asn Gln Val Ser Ala
2595 2600 2605

Lys Gly Thr Trp Arg Lys Ile Lys Glu Asn Glu Phe Ser Pro Thr Asn
2610 2615 2620

Ser Thr Ser Gln Thr Val Ser Ser Gly Ala Thr Asn Gly Ala Glu Ser
2625 2630 2635 2640

Lys Thr Leu Ile Tyr Gln Met Ala Pro Ala Val Ser Lys Thr Glu Asp
2645 2650 2655

Val Trp Val Arg Ile Glu Asp Cys Pro Ile Asn Asn Pro Arg Ser Gly
2660 2665 2670

Arg Ser Pro Thr Gly Asn Thr Pro Pro Val Ile Asp Ser Val Ser Glu
2675 2680 2685

Lys Ala Asn Pro Asn Ile Lys Asp Ser Lys Asp Asn Gln Ala Lys Gln
2690 2695 2700 

Asn Val Gly Asn Gly Ser Val Pro Met Arg Thr Val Gly Leu Glu Asn 
2705 2710 2715 2720 



Arg Leu Asn Ser Phe Ile Gln Val Asp Ala Pro Asp Gln Lys Gly Thr
2725 2730 2735

Glu Ile Lys Pro Gly Gln Asn Asn Pro Val Pro Val Ser Glu Thr Asn
2740 2745 2750

Glu Ser Ser Ile Val Glu Arg Thr Pro Phe Ser Ser Ser Ser Ser Ser
2755 2760 2765

Lys His Ser Ser Pro Ser Gly Thr Val Ala Ala Arg Val Thr Pro Phe
2770 2775 2780

Asn Tyr Asn Pro Ser Pro Arg Lys Ser Ser Ala Asp Ser Thr Ser Ala
2785 2790 2795 2800

Arg Pro Ser Gln Ile Pro Thr Pro Val Asn Asn Asn Thr Lys Lys Arg
2805 2810 2815

Asp Ser Lys Thr Asp Ser Thr Glu Ser Ser Gly Thr Gln Ser Pro Lys
2820 2825 2830 

Arg His Ser Gly Ser Tyr Leu Val Thr Ser Val 
2835 2840

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: ral2 (yeast) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Thr Gly Ala Lys Gly Leu Gln Leu Arg Ala Leu Arg Arg Ile Ala
1 5 10 15

Arg Ile Glu Gln Gly Gly Thr Ala Ile Ser Pro Thr Ser Pro Leu
20 25 30

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: m3 (mAChR) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Leu Tyr Trp Arg Ile Tyr Lys Glu Thr Glu Lys Arg Thr Lys Glu Leu
1 5 10 15

Ala Gly Leu Gln Ala Ser Gly Thr Glu Ala Glu Thr Glu
20 25

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: MCC 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Leu Tyr Pro Asn Leu Ala Glu Glu Arg Ser Arg Trp Glu Lys Glu Leu 
1 5 10 15

Ala Gly Leu Arg Glu Glu Asn Glu Ser Leu Thr Ala Met 
20 25 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTATCAAGAC TGTGACTTTT AATTGTAGTT TATCCATTTT 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TTTAGAATTT CATGTTAATA TATTGTGTTC TTTTTAACAG



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 
GTAGATTTTA AAAAGGTGTT TTAAAATAAT TTTTTAAGCT 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14
AAGCAATTGT TGTATAAAAA CTTGTTTCTA TTTTATTTAG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15
GTAACTTTTC TTCATATAGT AAACATTGCC TTGTGTACTC
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 
NNNNNNNNNN NNNGTCCCTT TTTTTAAAAA AAAAAAATAG 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 
GTAAGTAACT TGGCAGTACA ACTTATTTGA AACTTTAATA 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 
ATACAAGATA TTGATACTTT TTTATTATTT GTGGTTTTAG 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 

GTAAGTTACT TGTTTCTAAG TGATAAAACA GYGAAGAGCT 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
AATAAAAACA TAACTAATTA GGTTTCTTGT TTTATTTTAG 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 

GTTAGTAAAT TSCCTTTTTT GTTTGTGGGT ATAAAAATAG 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
ACCATTTTTG CATGTACTGA TGTTAACTCC ATCTTAACAG 40 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GTAAATAAAT TATTTTATCA TATTTTTTAA AATTATTTAA 40 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
CATGATGTTA TCTGTATTTA CCTATAGTCT AAATTATACC ATCTATAATG TGCTTAATTT 60 
TTAG 64 
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
GTAACAGAAG ATTACAAACC CTGGTCACTA ATGCCATGAC TACTTTGCTA AG 52
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GGATATTAAA GTCGTAATTT TGTTTCTAAA CTCATTTGGC CCACAG 46 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GTATGTTCTC TATAGTGTAC ATCGTAGTGC ATGTTTCAAA 40 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CATCATTGCT CTTCAAATAA CAAAGCATTA TGGTTTATGT TGATTTTATT TTTCAG 56

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GTAAGACAAA AATGTTTTTT AATGACATAG ACAATTACTG GTG 43 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
TTAGATGATT GTCTTTTTCC TCTTGCCCTT TTTAAATTAG 40 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 



50 



(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE : 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GTATGTTTTT ATAACATGTA TTTCTTAAGA TAGCTCAGGT ATGA 
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GCTTGGCTTC AAGTTGNCTT TTTAATGATC CTCTATTCTG TATTTAATTT ACAG 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 65 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GTACTATTTA GAATTTCACC TGTTTTTCTT TTTTCTCTTT TTCTTTGAGG CAGGGTCTCA 
CTCTG 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GCAACTAGTA TGATTTTATG TATAAATTAA TCTAAAATTG ATTAATTTCC AG 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
GTACCTTTGA AAACATTTAG TACTATAATA TGAATTTCAT GT 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CCAACTCNAA TTAGATGACC CATATTCAGA AACTTACTAG 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GTATATATAG AGTTTTATAT TACTTTTAAA GTACAGAATT CATACTCTCA AAAA 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATTGTGACCT TAATTTTGTG ATCTCTTGAT TTTTATTTCA G 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
TCCCCGCCTG CCGCTCTC 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 
GCAGCGGCGG CTCCCGTG 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41 
GTGAACGGCT CTCATGCTGC 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 
ACGTGCGGGG AGGAATGGA 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
ATGATATCTT ACCAAATGAT ATAC
(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 

TTATTCCTAC TTCTTCTATA CAG 23 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
TACCCATGCT GGCTCTTTTT C 21 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 
TGGGGCCATC TTGTTCCTGA 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47 
ACATTAGGCA CAAAGCTTGC AA 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 
ATCAAGCTCC AGTAAGAAGG TA 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 
TGCGGCTCCT GGGTTGTTG 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 
GCCCCTTCCT TTCTGAGGAC 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51 
TTTTCTCCTG CCTCTTACTG C 
(2) INFORMATION FOR SEQ ID NO:52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52
ATGACACCCC CCATTCCCTC 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53
CCACTTAAAG CACATATATT TAGT 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54 
GTATGGAAAA TAGTGAAGAA CC 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55



TTCTTAAGTC CTGTTTTTCT TTTG 
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 
TTTAGAACCT TTTTTGTGTT GTG 
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 
CTCAGATTAT ACACTAAGCC TAAC 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58
CATGTCTCTT ACAGTAGTAC CA



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 
AGGTCCAAGG GTAGCCAAGG 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 
TAAAAATGGA TAAACTACAA TTAAAAG 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 
AAATACAGAA TCATGTCTTG AAGT
(2) INFORMATION FOR SEQ ID NO: 62:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62

ACACCTAAAG ATGACAATTT GAG 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63 
TAACTTAGAT AGCAGTAATT TCCC 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64 
ACAATAAACT GGAGTACACA AGG 
(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65 
ATAGGTCATT GCTTCTTGCT GAT 
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66 
TGAATTTTAA TGGATTACCT AGGT 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 
CTTTTTTTGC TTTTACTGAT TAACG 
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68 
TGTAATTCAT TTTATTCCTA ATAGCTC 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69 
GGTAGCCATA GTATGATTAT TTCT 
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70 
CTACCTATTT TTATACCCAC AAAC 



(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71 
AAGAAAGCCT ACACCATTTT TGC 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72 
GATCATTCTT AGAACCATCT TGC 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73 

ACCTATAGTC TAAATTATAC CATC 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74 
GTCATGGCAT TAGTGACCAG
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75 
AGTCGTAATT TTGTTTCTAA ACTC 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76 
TGAAGGACTC GGATTTCACG C 
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77 
TCATTCACTC ACAGCCTGAT GAC 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78 
GCTTTGAAAC ATGCACTACG AT 
(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79 
AAACATCATT GCTCTTCAAA TAAC 
(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80 

TACCATGATT TAAAAATCCA CCAG 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81 
GATGATTGTC TTTTTCCTCT TGC 
(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82 
CTGAGCTATC TTAAGAAATA CATG 
(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83 
TTTTAAATGA TCCTCTATTC TGTAT 
(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84 
ACAGAGTCAG ACCCTGCCTC AAAG 
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85 
TTTCTATTCT TACTGCTAGC ATT 
(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86 
ATACACAGGT AAGAAATTAG GA 
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87 
TAGATGACCC ATATTCTGTT TC 
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88 
CAATTAGGTC TTTTTGAGAG TA 
(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89 
GTTACTGCAT ACACATTGTG AC 
(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90 
GCTTTTTGTT TCCTAACATG AAG 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91 
TCTCCCACAG GTAATACTCC C 
(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92 
GCTAGAACTG AATGGGGTAC G 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93 
CAGGACAAAA TAATCCTGTC CC 
(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94 
ATTTTCTTAG TTTCATTCTT CCTC 
(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95 
AGAAGGATCC CTTGTGCAGT GTGGA 
(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96 
GACAGGATCC TGAAGCTGAG TTTG 
(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97 
TCAGAAAGTG CTGAAGAG 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98 
GGAATAATTA GGTCTCCAA 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99 
GCAAATCCTA AGAGAGAACA A 
(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100 
GATGGCAAGC TTGAGCCAG 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101 
GTTCCAGCAG TGTCACAG 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102 
GGGAGATTTC GCTCCTGA 